Why does "uniroot()" confuse the input values while "optimize()" doesn't?
I'm wondering why, in the following two parallel R code blocks, uniroot() gives an incorrect answer (i.e., 1e+07) while optimize() finds the correct answer (i.e., 336.3954). Both blocks of code try to solve for df2 such that y = .15. What is the reason behind uniroot's failure and optimize's success?
############# Code Block 1 (with uniroot):

alpha = c(.025, .975); df1 = 3; peta = .3   # input values

f <- function(alpha, q, df1, df2, ncp){     # Objective function
  alpha - suppressWarnings(pf((peta / df1) / ((1 - peta) / df2), df1, df2, ncp, lower.tail = FALSE))
}

ncp <- function(df2){                       # root finding
  b <- sapply(c(alpha[1], alpha[2]),
              function(x) uniroot(f, c(0, 1e7), alpha = x, q = peta, df1 = df1, df2 = df2)[[1]])
  b / (b + (df2 + 4))
}

m <- function(df2, y){                      # A utility function
  abs(abs(diff(ncp(df2))) - y)
}

optimize(m, c(1, 1e7), y = .15)[[1]]        # Incorrect answer: 1e+07
############# Code Block 2 (with optimize):

f <- function(alpha, q, df1, df2, ncp){     # Objective function
  (alpha - suppressWarnings(pf((peta / df1) / ((1 - peta) / df2), df1, df2, ncp, lower.tail = FALSE)))^2
}

ncp <- function(df2){                       # root finding OR optimization
  b <- sapply(c(alpha[1], alpha[2]),
              function(x) optimize(f, c(0, 1e3), alpha = x, q = peta, df1 = df1, df2 = df2)[[1]])
  b / (b + (df2 + 4))
}

m <- function(df2, y){                      # A utility function
  abs(abs(diff(ncp(df2))) - y)
}

optimize(m, c(1, 1e7), y = .15)[[1]]        # Correct answer: 336.3954
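For context on the question: uniroot() brackets a sign change, while optimize() only descends toward a minimum, so the two can behave very differently on the same interval. A language-independent sketch of that bracketing requirement (Python, my illustration, not the question's code):

```python
# A bracketing root finder (like uniroot) needs f(lo) and f(hi) to have
# opposite signs; a minimizer of f(x)^2 (like optimize) does not.
def bisect(f, lo, hi, tol=1e-10):
    flo, fhi = f(lo), f(hi)
    if flo * fhi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        fmid = f(mid)
        if flo * fmid <= 0:
            hi = mid          # root lies in [lo, mid]
        else:
            lo, flo = mid, fmid
    return (lo + hi) / 2

f = lambda x: x ** 2 - 2      # root at sqrt(2)
root = bisect(f, 0, 2)        # valid bracket: f(0) < 0 < f(2)
# bisect(f, 2, 10) would raise, because that bracket contains no sign change
```

If the objective has no sign change over c(0, 1e7) for some df2, a bracketing solver has nothing to converge to, while a minimizer of the squared objective still returns a point.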
See also questions close to this topic

Add sequence of dates to data.table (R)
I have a data table that contains locations of places that have recurring events at different frequencies. The date of the last event is provided, as well as how frequently it occurs.
Example:
dt
#    Location Last_Occurrence Frequency
# 1:     Home       7-19-2018        30
# 2:   School        6-6-2018        60
# 3:     Moon        1-5-1993        90
What I would like to do is add a new column that includes all of the future event dates for each location up through the end of the year 2018.
So, I would like a table that looks something as follows:
dt
#    Location Last_Occurrence Frequency Next_Dates
# 1:     Home       7-19-2018        30  7-19-2018
# 2:     Home       7-19-2018        30  8-18-2018
# 3:     Home       7-19-2018        30  9-17-2018
# 4:     Home       7-19-2018        30 10-17-2018
# 5:     Home       7-19-2018        30 11-16-2018
# 6:     Home       7-19-2018        30 12-16-2018
# 7:   School        6-6-2018        60   6-6-2018
# 8:   School        6-6-2018        60   8-5-2018
# 9:   School        6-6-2018        60  10-4-2018
# etc.
How should I go about doing this? I suspect an lapply function would be useful, since I'm doing this over each location...
I've figured out how to use a "while" loop to generate a vector of future dates:
Last_Sample_Date <- Sys.Date()   # For testing
increase <- 5                    # For testing
NextDate <- Last_Sample_Date + increase
multiplier <- 1

# Create vector of next sampling dates - updated with each iteration of the while loop
NextDates <- c(Last_Sample_Date, NextDate)

while (year(NextDate) == 2018) {
  multiplier <- multiplier + 1
  NextDate <- NextDate + multiplier * increase
  # Add to vector of next sampling dates
  NextDates <- append(NextDates, NextDate)
}
(I realize this actually generates a vector containing the final date in 2019, but I'm OK with that.)
Could I use this while loop somehow, or is there another way I should go about this?
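The expansion rule itself ("repeat from Last_Occurrence in steps of Frequency days while still in 2018") is easy to state outside R; here is a Python sketch using the example table's values (my illustration; dates are month-day-year in the question):

```python
from datetime import date, timedelta

rows = [("Home", date(2018, 7, 19), 30),
        ("School", date(2018, 6, 6), 60),
        ("Moon", date(1993, 1, 5), 90)]

expanded = []
for loc, last, freq in rows:
    d = last
    while d.year <= 2018:                       # through the end of 2018
        expanded.append((loc, last, freq, d))   # one output row per event
        d += timedelta(days=freq)
```

In R, the per-row analogue would be seq(Last_Occurrence, as.Date("2018-12-31"), by = Frequency), expanded by location and bound back together, which avoids the explicit while loop.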

Find rows in a data frame where the text in one column can be found in another column, in R
I want to identify rows in a data frame where the text in one column can be found in another column. For example, in the data frame below, I would like to identify the rows in which the model column contains the text in the gear column (in this case, rows 1, 2, 7, 8, 32).
mydf <- cbind.data.frame(model = rownames(mtcars), gear = as.character(mtcars$gear),
                         stringsAsFactors = F)
mydf
                 model gear
1            Mazda RX4    4
2        Mazda RX4 Wag    4
3           Datsun 710    4
4       Hornet 4 Drive    3
5    Hornet Sportabout    3
6              Valiant    3
7           Duster 360    3
8            Merc 240D    4
9             Merc 230    4
10            Merc 280    4
11           Merc 280C    4
12          Merc 450SE    3
13          Merc 450SL    3
14         Merc 450SLC    3
15  Cadillac Fleetwood    3
16 Lincoln Continental    3
17   Chrysler Imperial    3
18            Fiat 128    4
19         Honda Civic    4
20      Toyota Corolla    4
21       Toyota Corona    3
22    Dodge Challenger    3
23         AMC Javelin    3
24          Camaro Z28    3
25    Pontiac Firebird    3
26           Fiat X1-9    4
27       Porsche 914-2    5
28        Lotus Europa    5
29      Ford Pantera L    5
30        Ferrari Dino    5
31       Maserati Bora    5
32          Volvo 142E    4
It seems like I should be able to use something like grep or match in combination with something like apply or map, or even ifelse, but I can't quite figure it out. (I could of course do a for loop but I have several million rows of data and would prefer not to.)
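The row-wise logic being asked for is just "is the gear string a substring of the model string, row by row"; a Python sketch on a few of the rows (my illustration; in R a vectorized analogue would be mapply over grepl, e.g. mydf[mapply(grepl, mydf$gear, mydf$model), ]):

```python
# Each tuple is (model, gear); keep the row index when gear occurs in model.
rows = [("Mazda RX4", "4"),       # "4" occurs in the model name
        ("Hornet 4 Drive", "3"),  # gear is 3; "3" does not occur
        ("Volvo 142E", "4")]

matches = [i for i, (model, gear) in enumerate(rows) if gear in model]
```

This is O(rows) with no explicit loop body in R, so it should be fine on several million rows.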

R: Changing scale and format of axis labels
I'm trying to put together a graph of data points as a function of time elapsed over the date, but the problem is I have too many data points for the date string size as you can see in the graph below.
I'd prefer if I could have the x-axis show just %Y-%m-%d instead of the full date and time, but I can't seem to get scale_x_date, scale_x_datetime, xlim, or xmin and xmax to work. Errors I've gotten:

Error: Invalid input: time_trans works with objects of class POSIXct only
Error: Invalid input: date_trans works with objects of class Date only
Code I have so far (with failures commented out):
library(ggplot2)
library(scales)

mydata <- read.csv("/Users/user/R/restore_graphs/CSV/store.csv.tmp")

restore.df = data.frame(
  Time = mydata$start,
  Duration = mydata$time,
  Labels = gsub(" [0-9]{1,2}:[0-9]{1,2}:[0-9]{1,2}", "", mydata$start)
)

p <- ggplot(restore.df, aes(x = Time, y = Duration)) + geom_point(colour = "red")

# Failures commented out:
# p <- ggplot(restore.df, aes(x = Time, y = Duration)) + geom_point(colour = "red") +
#   scale_x_datetime(date_labels = "%Y-%m-%d %H")
# p + scale_x_date(date_labels = "%y-%m-%d",
#                  limits = c(as.Date('2018-06-14', "%Y-%m-%d"), as.Date('2018-06-20', "%Y-%m-%d")))
# p + xlim(as.Date('2018-06-14', "%Y-%m-%d"), as.Date('2018-06-20', "%Y-%m-%d"))
# aes(xmin = as.Date("2018-06-14", "%Y-%m-%d"), xmax = as.Date("2018-06-20", "%y-%m-%d"))
# dput(restore.df$Time)

print(p)
When I run the line with ggplot changed to:
p <- ggplot(restore.df, aes(x = Time, y = Duration,
                            xmin = as.Date("2018-06-14", "%Y-%m-%d"),
                            xmax = as.Date("2018-06-20", "%y-%m-%d"))) + geom_point(colour = "red")
It changes the graph to have every point shoved to the left of the screen.
Sample data:
uuid,db,table,start,stop,time,size
941439639,test,,"2018-06-14 17:35:07","2018-06-14 17:35:07",62.9666666666667,141329782065
890252165,test,,"2018-06-14 23:35:38","2018-06-14 23:35:38",61.7166666666667,141380294237
943883747,test,,"2018-06-15 05:38:39","2018-06-15 05:38:39",77.7666666666667,141469254934
827384296,test,,"2018-06-15 11:35:11","2018-06-15 11:35:11",63.4166666666667,141276941916
454468935,test,,"2018-06-15 17:35:23","2018-06-15 17:35:23",64.4333333333333,141380122325
705894402,test,,"2018-06-15 23:35:29","2018-06-15 23:35:29",63.9,141715941073
396694772,test,,"2018-06-16 05:39:59","2018-06-16 05:39:59",75.0666666666667,141789270192
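The errors quoted above suggest Time is still a character column, so the date/time scales cannot transform it; the general fix is parse first, then format the labels. A minimal Python sketch of the parse-then-relabel idea on two of the sample timestamps (my illustration; the R analogue is converting with as.POSIXct before applying scale_x_datetime):

```python
from datetime import datetime

stamps = ["2018-06-14 17:35:07", "2018-06-15 05:38:39"]
# parse the character timestamps into real datetime objects...
parsed = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
# ...then day-level axis labels are just a strftime away
labels = [d.strftime("%Y-%m-%d") for d in parsed]
```

Once the column is a real datetime type, limits and label formats apply cleanly instead of erroring on character data.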

How to extract all functions and API calls used in a Python source code?
Let us consider the following Python source code;
def package_data(pkg, roots):
    data = []
    for root in roots:
        for dirname, _, files in os.walk(os.path.join(pkg, root)):
            for fname in files:
                data.append(os.path.relpath(os.path.join(dirname, fname), pkg))
    return {pkg: data}
From this source code, I want to extract all the functions and API calls. I found a similar question and solution. I ran the solution given there, and it generates the output [os.walk, data.append]. But I am looking for the following output: [os.walk, os.path.join, data.append, os.path.relpath, os.path.join]. What I understood after analyzing the solution code below is that it visits every node before the first bracket and drops the rest.
import ast

class CallCollector(ast.NodeVisitor):
    def __init__(self):
        self.calls = []
        self.current = None

    def visit_Call(self, node):
        # new call, trace the function expression
        self.current = ''
        self.visit(node.func)
        self.calls.append(self.current)
        self.current = None

    def generic_visit(self, node):
        if self.current is not None:
            print("warning: {} node in function expression not supported".format(
                node.__class__.__name__))
        super(CallCollector, self).generic_visit(node)

    # record the func expression
    def visit_Name(self, node):
        if self.current is None:
            return
        self.current += node.id

    def visit_Attribute(self, node):
        if self.current is None:
            self.generic_visit(node)
        self.visit(node.value)
        self.current += '.' + node.attr

tree = ast.parse(yoursource)
cc = CallCollector()
cc.visit(tree)
print(cc.calls)
Can anyone please help me modify this code so that it can also traverse the API calls inside the brackets?
N.B.: This can be done using regex in Python, but that requires a lot of manual labor to find the appropriate API calls, so I am looking for something based on the abstract syntax tree.
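One way to get the nested calls too (a sketch of mine, not the linked solution) is to skip the visitor bookkeeping entirely: scan every node with ast.walk, and rebuild a dotted name from each Call's func expression.

```python
import ast

source = '''
def package_data(pkg, roots):
    data = []
    for root in roots:
        for dirname, _, files in os.walk(os.path.join(pkg, root)):
            for fname in files:
                data.append(os.path.relpath(os.path.join(dirname, fname), pkg))
    return {pkg: data}
'''

def dotted_name(node):
    # Name -> "os"; Attribute chains -> "os.path.join"; anything else -> None
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return base + "." + node.attr if base else None
    return None

# ast.walk yields every node, so calls nested inside argument lists are found too
calls = [dotted_name(n.func) for n in ast.walk(ast.parse(source))
         if isinstance(n, ast.Call)]
calls = [c for c in calls if c is not None]
```

ast.walk visits nodes breadth-first, so the order differs from source order, but all five calls (os.walk, os.path.join twice, data.append, os.path.relpath) are recovered.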

Passing data between functions, using Main()
I am trying to pass data from one function to another within a class and print it. I am unsuccessful and keep getting errors. The error is at the bottom. Thanks in advance.
class Stocks(object):
    def __init__(self, name):
        self.name = name
        print(name)

    def Get_Data(self):
        #self.data2 = data2
        #print(self.name)
        data2 = web.get_data_yahoo(self.name, data_source='yahoo',
                                   start=start_date, end=end_date)['Adj Close']
        #print(data2)
        #data2.plot(figsize=(10,5))
        #plt.show()
        return data2

    def Main(self, Get_Data):
        x = Stocks(Get_Data())
        print(x)
        #data2.plot(figsize=(10,5))
        #plt.show()

z = Stocks('GE')
z.Get_Data()
z.Main()

error:

TypeError                                 Traceback (most recent call last)
<ipython-input-162-91f08c3bdb57> in <module>()
     32 z = Stocks('GE')
     33 z.Get_Data()
---> 34 z.Main()

TypeError: Main() missing 1 required positional argument: 'Get_Data'
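A minimal sketch of one likely fix (my illustration): Main takes no extra argument and reaches Get_Data through self. The Yahoo download is replaced with a stub list, since pandas-datareader and the start/end dates are outside the snippet.

```python
class Stocks(object):
    def __init__(self, name):
        self.name = name

    def Get_Data(self):
        # stub standing in for web.get_data_yahoo(...)['Adj Close']
        return [100.0, 101.5, 99.8]

    def Main(self):
        data = self.Get_Data()   # no parameter needed: use self
        return data

z = Stocks('GE')
result = z.Main()
```

The original error arises because Main(self, Get_Data) declares a second positional parameter, so z.Main() is one argument short; constructing a second Stocks inside Main is also unnecessary.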

How to call a python class like calling a function
Searching to build up my personal style of programming, I want to be able to call a Python class the same way I'd call a Python function. Here's what I mean: consider this function:
def Factorial(n):
    if n == 0:
        return 1
    else:
        return n * Factorial(n - 1)
This is a function that outputs 24 when you call Factorial(4).
Now let's consider a class instead:
class Factorial:
    def __call__(self, n):
        if n == 0:
            return 1
        else:
            return n * Factorial()(n - 1)
This code works the same way as the previous code except at call time where you instead write:
Factorial()(4) # which outputs 24
Now my question is how could you do this instead:
Factorial(4) # just that, and output 24, from THE object.
Thanks!
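One way this is commonly done (a sketch of my own, not from the question) is to intercept construction in __new__, so the class call returns the computed value instead of an instance:

```python
# Because __new__ returns an int rather than a Factorial instance,
# __init__ is never run and Factorial(4) is just the number 24.
class Factorial:
    def __new__(cls, n):
        if n == 0:
            return 1
        return n * Factorial(n - 1)

print(Factorial(4))  # prints 24
```

Other common routes are a plain function (as in the first snippet), a classmethod, or defining __call__ on a metaclass; __new__ is simply the smallest change to the class as written.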

Quantile equal to mathematical expectation
I am trying to find any distribution satisfying the following:
E(X) = x_5, where
F(x_5) <= 0.05 ; F(x_5 + 0) >= 0.05
Is there any distribution having that property?
I tried to find one among the exponential distribution, the lognormal, and so on, but I didn't succeed.
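For what it's worth, the inequalities can be satisfied by a simple two-point distribution; the construction below is my own illustration, not from the question:

```python
# X = -100 with probability 0.05 and X = 0 with probability 0.95.
# Then E(X) = -5, and the CDF satisfies F(E(X)) <= 0.05 while the
# right-hand limit F(E(X) + 0) >= 0.05.
p, a, b = 0.05, -100.0, 0.0
mean = p * a + (1 - p) * b          # -5.0

def F(x):
    # CDF of the two-point distribution
    if x < a:
        return 0.0
    if x < b:
        return p
    return 1.0
```

Any distribution with enough mass far below its bulk works the same way; the standard right-skewed choices (exponential, lognormal) cannot, since their means sit well above the 5th percentile.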
How to calculate distance between the below latitude and longitude points?
A = 0N, 1W   (0 degrees North and 1 degree West)
B = 0N, 179E (0 degrees North and 179 degrees East)
C = 90N, 0E  (90 degrees North and 0 degrees East)

I have to find the distances AB and BC, where the radius is 6400 units.
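A sketch of the standard haversine great-circle computation with the three points and R = 6400 (Python; my illustration, any consistent unit works):

```python
from math import radians, sin, cos, asin, sqrt, pi

def dist(lat1, lon1, lat2, lon2, R=6400.0):
    # great-circle distance via the haversine formula
    p1, p2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * R * asin(sqrt(a))

A = (0.0, -1.0)    # 0 N, 1 W  (west = negative longitude)
B = (0.0, 179.0)   # 0 N, 179 E
C = (90.0, 0.0)    # 90 N, 0 E (the north pole)

AB = dist(A[0], A[1], B[0], B[1])   # A and B are antipodal: pi * R
BC = dist(B[0], B[1], C[0], C[1])   # equator to pole: pi * R / 2
```

A and B are 180 degrees of longitude apart on the equator, so AB is half the great circle; any point on the equator is a quarter circle from the pole.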

What is saved in the model of an sklearn Bayesian classifier?
I believe that a Bayesian classifier is based on a statistical model. But after training a Bayesian model, I can save it and do not need the training dataset to predict the test data. For example, if I build a Bayesian model by
Can I take the model as an equation like this?
If so, how can I extract the weights and bias, and what does the new formula look like? If not, what is the new equation like?
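As a hedged illustration of what a trained Gaussian naive Bayes model actually stores: sklearn's GaussianNB keeps the class priors, per-class feature means, and per-class variances (class_prior_, theta_, and var_; sigma_ in older versions), and prediction can be reproduced from those stored numbers alone. The values below are made up for a one-feature, two-class toy model:

```python
from math import log, pi

priors = {0: 0.5, 1: 0.5}
means  = {0: [0.0], 1: [4.0]}   # per-class feature means (theta_)
varis  = {0: [1.0], 1: [1.0]}   # per-class feature variances (var_)

def log_gauss(x, mu, var):
    # log density of N(mu, var) at x
    return -0.5 * log(2 * pi * var) - (x - mu) ** 2 / (2 * var)

def predict(x):
    # argmax over classes of log prior + sum of per-feature log densities
    scores = {c: log(priors[c]) + sum(log_gauss(xi, mu, v)
                                      for xi, mu, v in zip(x, means[c], varis[c]))
              for c in priors}
    return max(scores, key=scores.get)
```

So the saved model is not a single weight/bias equation: it is these per-class summary statistics, and the "formula" is the argmax above.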

Optimizing a benchmark test function using the Nelder-Mead algorithm
I wrote a Nelder-Mead algorithm for minimizing a non-constrained optimization problem (the objective function was shown in a linked image). This test function is very similar to the Biggs EXP5 function, except it does not have the power 2. My NM algorithm is getting stuck in the shrink step, and I don't have any idea where and how to select the initial point or how to get near the global minimum. Does anyone know what the problem is? I'll appreciate any help.
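I can't see the linked objective function, but one frequent cause of premature shrinking is a degenerate or badly scaled initial simplex. A small sketch (my own, illustrative) of the common "base point plus one perturbed vertex per coordinate" construction, with a relative step so the simplex matches the scale of x0:

```python
def initial_simplex(x0, step=0.1):
    simplex = [list(x0)]
    for i in range(len(x0)):
        v = list(x0)
        # absolute step for zero coordinates, relative step otherwise
        v[i] += step if v[i] == 0 else step * abs(v[i])
        simplex.append(v)
    return simplex

s = initial_simplex([1.0, 0.0, 2.0], step=0.5)
```

When the algorithm still stalls, restarting from the best vertex with a fresh simplex (possibly a smaller step) is the usual pragmatic remedy for exponential-fit objectives like the Biggs family, which are notoriously ill-conditioned.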

Multi-parameter constraint minimization in R
I am trying to figure out if it is possible to use R to perform a minimization with multiple constraints, where the constraints do not have a closed form.
In a nutshell, I have a set of random numbers (generated with fixed seeds). I have also defined a function, say f(theta), that takes a vector of model parameters and transforms my original set of random numbers into a random sample from the defined distribution.
I am trying to find a set of parameters such that, when fed into f, they result in a sample whose 2.5th, 5th, and 10th percentiles are less than or equal to set targets, while minimizing the difference between the target and actual values.
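Since the constraints have no closed form, a common workaround is to fold them into the objective as penalty terms and hand the result to any unconstrained optimizer (optim() in R, for instance). A sketch below in Python; the transform, the percentile rule, and the targets are all placeholders of mine, not the question's actual f:

```python
def percentile(sample, p):
    # simple empirical percentile on a sorted copy
    s = sorted(sample)
    return s[int(p * (len(s) - 1))]

def penalized_objective(theta, base_noise, targets):
    # stand-in for f(theta): transform fixed random draws into a sample
    sample = [theta[0] + theta[1] * z for z in base_noise]
    obj = 0.0
    for p, target in targets:
        q = percentile(sample, p)
        obj += (q - target) ** 2            # track distance to the target
        if q > target:                      # penalize violating q <= target
            obj += 1000.0 * (q - target)
    return obj

val = penalized_objective([0.0, 1.0], [-1.0, 0.0, 1.0], [(0.5, 0.0)])
```

Because the random numbers are fixed by seed, the penalized objective is deterministic in theta, which is what makes generic optimizers applicable at all.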

Getting Information about Optimal Solution from Multidimensional Knapsack Algorithm
I am building a multidimensional knapsack algorithm to optimize fantasy NASCAR lineups. I have the code thanks to another author and am now trying to piece back together the drivers the optimal solution consists of. I have written code to do this in the standard case, but am struggling to figure it out with the added dimension. Here's my code:
# open csv file
df = pd.read_csv('roster_kentucky_july18.csv')
print(df.head())

def knapsack2(n, weight, count, values, weights):
    dp = [[[0] * (weight + 1) for _ in range(n + 1)] for _ in range(count + 1)]
    for z in range(1, count + 1):
        for y in range(1, n + 1):
            for x in range(weight + 1):
                if weights[y - 1] <= x:
                    dp[z][y][x] = max(dp[z][y - 1][x],
                                      dp[z - 1][y - 1][x - weights[y - 1]] + values[y - 1])
                else:
                    dp[z][y][x] = dp[z][y - 1][x]
    return dp[-1][-1][-1]

w = 50000
k = 6
values = df['total_pts']
weights = df['cost']
n = len(values)
limit_fmt = 'Max value for weight limit {}, item limit {}: {}'
print(limit_fmt.format(w, k, knapsack2(n, w, k, values, weights)))
And my output:
              Driver  total_pts  cost
0  A.J. Allmendinger  29.030000  6400
1        Alex Bowman  39.189159  7600
2      Aric Almirola  53.746988  8800
3      Austin Dillon  32.476250  7000
4        B.J. McLeod  14.000000  4700
Max value for weight limit 50000, item limit 6: 325.00072048
I'm looking to at least get the "cost" associated with each "total_pts" in the optimal solution, though it would be nice if I could have it draw out the "Driver" column of the dataframe instead (which I guess could be accessed by indices). Thanks.
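To recover the items rather than just the optimal value, the usual approach is to keep the full DP table and walk it backwards, taking item y-1 whenever dp[z][y][x] differs from dp[z][y-1][x]. A sketch (my adaptation of the question's knapsack2, run on a small made-up instance rather than the NASCAR data):

```python
def knapsack_items(weight, count, values, weights):
    n = len(values)
    dp = [[[0] * (weight + 1) for _ in range(n + 1)] for _ in range(count + 1)]
    for z in range(1, count + 1):
        for y in range(1, n + 1):
            for x in range(weight + 1):
                if weights[y - 1] <= x:
                    dp[z][y][x] = max(dp[z][y - 1][x],
                                      dp[z - 1][y - 1][x - weights[y - 1]] + values[y - 1])
                else:
                    dp[z][y][x] = dp[z][y - 1][x]
    # backtrack: item y-1 was taken whenever the value changed at step y
    chosen, z, x = [], count, weight
    for y in range(n, 0, -1):
        if dp[z][y][x] != dp[z][y - 1][x]:
            chosen.append(y - 1)
            x -= weights[y - 1]
            z -= 1
    return dp[count][n][weight], chosen

# tiny made-up instance: weight cap 50, at most 2 items
total, items = knapsack_items(50, 2, [60, 100, 120], [10, 20, 30])
```

With the row indices in hand, df.iloc[items] would pull out the Driver and cost columns of the chosen lineup.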

Can the Nullspace of the Hessian be Computed Without Computing the Hessian?
I have a class of functions like (y - x^2)^2 with equal local minima of 0 along a manifold in the input space (in this case the manifold y = x^2). I would like to follow the valley of minima, but doing so requires a direction of travel (in this case any nonzero constant multiple of <1, 2x> for an input <x, x^2>).
Ordinarily, the nullspace of the Jacobian works fine for this purpose (representing the directions where the directional derivative is 0), but since we are starting from a local minimum the Jacobian is identically 0.
When the Jacobian is identically 0 and the Hessian is not, the allowable directions of travel are simply the nullspace of the Hessian. I'm applying this to 100,000+ dimensional problems, though, and a direct computation of the Hessian can be considered too expensive an operation. In my admittedly low-quality computer the Hessian will not even fit in RAM, whereas Jacobian computations have reasonable memory constraints and execute in under a second.
A variety of tricks allow operations with Hessians to be done with lower-order objects instead. For example, the Hessian-vector product H(x)v is approximately equal to a difference of Jacobians, (J(x + rv) - J(x))/r, and there are tricks with complex numbers or automatic differentiation to do essentially the same computation without the induced numerical error.
Since the nullspace of the Hessian is simply the set of vectors v such that H(x)v is identically 0, this bears a resemblance to problems which don't explicitly require a computation of H(x). Can one use a similar trick to compute the nullspace of the Hessian without directly computing the Hessian?
Update: For bonus points, what about the third-order and higher analogues of the Jacobian and Hessian? In the event the Hessian is identically 0, the nullspace of a higher-order object must be found instead.
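Building on the difference-of-Jacobians trick described above, the standard route is to treat H(x) as a matrix-free linear operator: anything that only needs H(x)v (power iteration, Lanczos, LOBPCG) can then hunt for near-null directions without forming H. A small self-contained sketch of the operator itself, with finite differences everywhere (my own illustrative code, on the example f(x, y) = (y - x^2)^2):

```python
def grad(f, x, h=1e-6):
    # central-difference gradient (stand-in for an analytic Jacobian)
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def hess_vec(f, x, v, r=1e-4):
    # H(x) v  ~=  (grad f(x + r v) - grad f(x)) / r
    xr = [xi + r * vi for xi, vi in zip(x, v)]
    g1, g0 = grad(f, xr), grad(f, x)
    return [(a - b) / r for a, b in zip(g1, g0)]

f = lambda p: (p[1] - p[0] ** 2) ** 2
x = [1.0, 1.0]                            # a point on the valley y = x^2
hv_tangent = hess_vec(f, x, [1.0, 2.0])   # valley direction: ~ [0, 0]
hv_other = hess_vec(f, x, [1.0, 0.0])     # generic direction: ~ [8, -4]
```

Feeding this operator to an iterative eigensolver and keeping the smallest eigenpairs yields the near-nullspace without ever materializing H; the same operator idea extends one order up by differencing Hessian-vector products.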

EXCEL VBA: Dot Product using Arrays
Below is example code which is an excerpt from a larger whole. I am attempting to compute the dot product of the vectors beta and Xtempj, which should be a scalar, and then to multiply the resulting scalar by another scalar, Ycoded(j, 1).
However, I am receiving an error message "Type mismatch" during the assignment statement for temp1(j, 1).

Option Explicit

Sub XX()
    Dim beta As Variant
    Dim temp1 As Variant
    Dim X5 As Variant
    Dim Xtempj As Variant
    Dim Ycoded As Variant

    ReDim beta(1 To 2, 1 To 1)
    ReDim X5(1 To 2, 1 To 2)
    ReDim temp1(1 To 2, 1 To 1)
    ReDim Xtempj(1 To 2, 1 To 1)
    ReDim Ycoded(1 To 2, 1 To 1)

    beta(1, 1) = 0.510825624
    beta(2, 1) = 0

    X5(1, 1) = 1
    X5(1, 2) = 45
    X5(2, 1) = 1
    X5(2, 2) = 76

    Ycoded(1, 1) = 1
    Ycoded(2, 1) = 0

    For j = 1 To 2
        For k = 1 To 2
            Xtempj(k, 1) = X5(j, k)
        Next k
        temp1(j, 1) = WorksheetFunction.MMult(Application.Transpose(beta), Xtempj) * Ycoded(j, 1)
    Next j
End Sub
This error message makes me think that VBA is thinking of Ycoded(j, 1) as a 1 x 1 array. Therefore, I also tried the following statement:

temp1(j, 1) = WorksheetFunction.MMult(WorksheetFunction.MMult(Application.Transpose(beta), Xtempj), Ycoded(j, 1))

However, here I receive the error "Unable to get the MMult property of the WorksheetFunction class".
I can do this kind of thing in R or SAS Proc IML in my sleep, so this is VERY frustrating. Any assistance/insight is appreciated.
Best,
Dan
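For what it's worth, the intended arithmetic is easy to state outside VBA: temp1(j, 1) = (beta . X5 row j) * Ycoded(j, 1), and the key point is that the dot product must be reduced to a plain scalar before the scalar multiply, which is the step the 1 x 1 MMult result trips over. A sketch in Python with the question's numbers (my illustration):

```python
beta = [0.510825624, 0.0]
X5 = [[1.0, 45.0], [1.0, 76.0]]
Ycoded = [1.0, 0.0]

temp1 = []
for j in range(2):
    dot = sum(b * x for b, x in zip(beta, X5[j]))  # scalar dot product
    temp1.append(dot * Ycoded[j])                  # scalar * scalar
```

In VBA terms, assigning the MMult result to a Variant and then indexing its (1, 1) entry, or computing the dot product with Application.SumProduct on the two column arrays, yields the scalar this sketch takes for granted.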

The smartest way to convert a half-vectorization back to a full matrix
I have a question regarding the topic mentioned above. I have a large data set. Each row corresponds to one day, and the matrix is recorded in vech form, i.e., by vectorizing the unique upper triangle of the matrix and transposing it into a row vector.
I want to transform the whole data set back into the full matrices of each day (observation). A short reproducible example is below:
structure(list(
  X01 = c(5.89378246085557, 5.75461814891448, 8.01511372310818),
  X02 = c(2.04749233527123, 1.79201580489132, 4.13243690125304),
  X03 = c(6.84437620663595, 7.76572007184568, 9.21387189085179),
  X04 = c(1.48894672990183, 1.412996838366, 2.60650888282447),
  X05 = c(0.951513482112949, 1.37836898031636, 2.74942660284063),
  X06 = c(2.68732004256996, 2.70518391829012, 4.74436162847904),
  X07 = c(2.15270455626705, 2.47115157067303, 3.92259973345368),
  X08 = c(2.12319802206402, 3.08674733009856, 3.91968874234002),
  X09 = c(1.92541015217767, 2.62688861593519, 2.84482322630633),
  X10 = c(10.2251876218029, 9.5296229460776, 8.46917045978735),
  X11 = c(1.23017267644711, 1.1675692778204, 1.93502656632884),
  X12 = c(1.24956978625185, 1.78431799528065, 2.81675019026563),
  X13 = c(1.07497786235713, 0.422607699395901, 1.51342480871545),
  X14 = c(1.36996532434845, 1.85499779637815, 1.49126581139642),
  X15 = c(6.33847251476969, 4.54434019245843, 7.27901008329251),
  X16 = c(2.08364028735932, 1.74263661965122, 2.25975717022752),
  X17 = c(1.24649025820314, 1.95698292727337, 3.12139710484827),
  X18 = c(0.647824716200822, 0.805958808007548, 2.33918923838555),
  X19 = c(2.05060707165895, 2.06986549088027, 1.99435629106657),
  X20 = c(0.655024785094781, 1.22421902352593, 0.811896637188255),
  X21 = c(4.20465438339735, 3.4827652599631, 7.63429180588341)
), row.names = c(NA, 3L), class = "data.frame")
The example contains the first three rows, so three matrices should be the expected output.
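The mapping itself is mechanical: 21 unique elements correspond to n(n+1)/2 with n = 6, so each row unpacks into a 6 x 6 symmetric matrix. A language-independent sketch of the inverse-vech step (Python, my illustration; the ordering assumed here is row-major over the upper triangle, including the diagonal):

```python
def unvech(v):
    n = int((((8 * len(v) + 1) ** 0.5) - 1) / 2)   # solve n(n+1)/2 = len(v)
    M = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i, n):
            M[i][j] = M[j][i] = v[k]               # mirror into both triangles
            k += 1
    return M

M = unvech([1, 2, 3, 4, 5, 6])   # 3 x 3 check on a small vector
```

In R the analogous fill allocates matrix(0, n, n), assigns m[upper.tri(m, diag = TRUE)] (minding that R's upper.tri indexing is column-major, which may differ from the recording order), and copies t(m) into the lower triangle; applying the function to each row of the data frame gives one matrix per day.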