R programming for linear model
model2 <- lm(formula = Losses.in.Thousands~Age, Years.of.Experience,Gender, Married, data = default)
Error in model.frame.default(formula = Losses.in.Thousands ~ Age, data = default, : object 'Married' not found
See also questions close to this topic

How can I use foreach function well?
I would like to use the foreach() function in R. Here's my example code.
library(randomForestSRC)
library(dplyr)
library(ROCR)
library(doParallel)
data(pbc, package="randomForestSRC")
data_na <- na.omit(pbc)
data_na <- data_na %>% dplyr::select(days)
foreach(VAR=age) %do% {
  data_na <- data_na %>% mutate(Q4 = ifelse(data_na[,"VAR"]<=unname(quantile(data_na[,"VAR"], 0.25)), 0,
                                     ifelse(data_na[,"VAR"]<=unname(quantile(data_na[,"VAR"], 0.50)), 1,
                                     ifelse(data_na[,"VAR"]<=unname(quantile(data_na[,"VAR"], 0.75)), 2, 3))))
}
Without modifying the rest of the code, I want to be able to switch the loop to foreach(VAR=age), foreach(VAR=bili), etc. But according to the error message, this code treats "age" as an object. How can I run this code without error?

How to include a GIF in rshiny
I am trying to include a GIF from my local folder in an rshiny app (runtime: shiny), but I have failed to find any solution or workaround, and any help is much appreciated. The GIF itself is a series of ggplots created within a loop and saved via animation::saveGIF(). A potential workaround would be to run the loop within the app and create the GIF on the fly, but this is not working in shiny either. Thank you in advance!

How to expand the function of seq() to list or dataframe?
I have a list that looks like this:
$a
[1] 2
$b
[1] 5
$c
[1] 3
I wonder if there is a way to turn it into:
$a
[1] 1 2
$b
[1] 1 2 3 4 5
$c
[1] 1 2 3
Thank you very much for your answer!!

Logistic regression in matlab using mnrfit
I'm trying to use the mnrfit function, but I get the error "If Y is a column vector, it must contain positive integer category numbers." My data is in a double and my Y values are floats, e.g. 0.6667. Is there a way that I can adjust my data to be able to use the mnrfit function?
Thanks in advance! An inexperienced beginner

How to set a range of predicted value in regression based on the actual value?
I am using regression to predict the price of a product, but I want every predicted value to be no more than 10 above or 10 below the actual value. For example, if the actual value is 20, the predicted value should be between 10 and 30.
I am not focusing on RMSE or other metrics, but the predictions should satisfy the condition above.
How can I do that using any regression technique? I am looking for suggestions.

Really huge hazard ratio with PHREG competing risk model: error?
I am modeling a competing risk model of disease progression with death as a competing risk (0=censor, 1=progression, 2=death without progression). I am using SAS's PHREG code as below, and it's giving me one variable with a huge hazard ratio. All the independent variables are binary yes/no. Does anyone know why the hazard ratio is so large? At the bottom I attached a screenshot of the model output, and also a separate screenshot with the # of events/competing events when the LIFETEST procedure is run with fluoro as the strata argument.
proc phreg data=dataset plots(overlay=stratum)=cif;
  class cephalo;
  class an_coverage;
  class fluoro;
  class vancomycin_taken;
  model fu_time_prog*prog_cr(0)= vancomycin_taken an_coverage fluoro cephalo / eventcode=1;
run;

Bayesian Logistic Regression
I am trying to estimate a Bayesian Logistic Regression. I'm trying to specify my generic model as follows, and I am receiving an error regarding my '}' that I do not understand. Can someone help me understand why this is?
This is the error message I am receiving:
Error: unexpected '}' in "}"
logistic_model <- model{
  for(i in 1:n){
    logit(q[i]) <- beta[1] + beta[2]*X[i,1] + beta[3]*X[i,2] + beta[4]*X[i,3] + beta[5]*X[i,4] + beta[6]*X[i,5]
    Y[i] ~ dbern(q[i])
  }
  for(j in 1:6){
    beta[j] ~ dnorm(0,0.1)
  }
}

Train a logistic regression model in parts for big data
My data set consists of 1.6 million rows and 17000 columns after preprocessing. I want to use logistic regression on this data; however, the process gets killed every time I load the dataset. Is there a way I can train a logistic regression model in chunks, with the coefficients being updated at each iteration? Does sklearn support any technique for my problem?
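To make the "train in chunks, updating the coefficients each time" idea concrete, here is a minimal plain-Python sketch of mini-batch gradient descent for logistic regression. It is not sklearn's API (scikit-learn does expose incremental learning through `partial_fit` on some estimators such as `SGDClassifier`, which this sketch does not use). The names `update_on_chunk`, `iter_chunks`, and `fit_in_chunks`, the learning rate, and the toy data are all assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def update_on_chunk(weights, bias, chunk, lr=0.1):
    """One gradient step using only the rows in `chunk`.

    `chunk` is a list of (features, label) pairs with label 0 or 1.
    """
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for features, label in chunk:
        p = sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)
        err = p - label                      # gradient of the log-loss w.r.t. the logit
        for j, x in enumerate(features):
            grad_w[j] += err * x
        grad_b += err
    n = len(chunk)
    new_weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return new_weights, bias - lr * grad_b / n

def iter_chunks(rows, size):
    # Stand-in for reading `size` rows at a time from disk (hypothetical helper).
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

def fit_in_chunks(read_chunks, n_features, epochs=50, lr=0.1):
    """`read_chunks` is a callable returning a fresh iterator over chunks."""
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        for chunk in read_chunks():          # only one chunk is held at a time
            weights, bias = update_on_chunk(weights, bias, chunk, lr)
    return weights, bias

# Toy data: one feature, label 1 when the feature is large.
toy = [([0.0], 0), ([1.0], 0), ([2.0], 0), ([3.0], 1), ([4.0], 1), ([5.0], 1)]
weights, bias = fit_in_chunks(lambda: iter_chunks(toy, 2), n_features=1, epochs=200)
print(weights, bias)
```

With real data, `read_chunks` would stream rows from a file instead of slicing an in-memory list; with 17000 columns a sparse representation would probably also be needed, which this sketch does not attempt.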

Why is the accuracy of the logistic regression classifier different from k-nearest neighbors?
I understand how to compute accuracy for each but I don't understand why they are different.

How to set upper and lower bounds for each element in a set?
I am creating a GAMS model to solve a simple maximization problem. I have a set J with 3 elements (1,2,3) and a variable x(J) that encompasses all the elements.
I am wondering if there is a way in GAMS to set a lower bound of 0 and upper bound of 3 to each element in the set without having to set each element bound individually and without using the positive variable keyword for the lower bound.
I have tried using x.lo =e= 0 and x.up =e= 3, but neither of these works. I am guessing I am not using the correct syntax, but for the life of me I cannot find anything in the official documentation about it specifically for sets.
What is the correct way of doing this?

How to add constraints to Linear Programming of the Knapsack Problem in R?
I was working through the code found at: https://sites.math.washington.edu/~conroy/2015/m381aut2015/Rexamples/knapsack.r
I was wondering if anyone knows how to add a conditional constraint that only allows for a certain number of items in the knapsack. How would I modify the code to still optimize the value of the knapsack but only take a certain number of items?
# import the lpSolve library
library(lpSolve)
# objective function
knapsack.obj <- c(500,300,100,210,360,180,220,140,90)
# constraints
knapsack.con <- matrix(c(30,35,10,15,35,22,29,18,11),nrow=1,byrow=TRUE)
knapsack.dir <- c("<=")
knapsack.rhs <- c(100)
# solve
# Note when we call the lp function, we set all.bin=TRUE to indicate that all variables are 0 or 1
# If we just wanted to specify integer values generally, we would set all.int=TRUE
# The default for both of these options is FALSE
knapsackSolution <- lp("max",knapsack.obj,knapsack.con,knapsack.dir,knapsack.rhs,all.bin=TRUE)
print("Solution is:")
print(knapsackSolution$solution)
print("Objective function value at solution is:")
print(knapsackSolution$objval)

Clustering while trying to minimise spare capacity
I am trying to cluster ~30 million points (x and y coordinates) into clusters. The addition that makes it challenging is that I am trying to minimise the spare capacity of each cluster.
Each cluster is made from equipment that can serve 64 points. If a cluster contains fewer than 65 points, then we need one of these pieces of equipment. However, if a cluster contains 65 points, then we need two of these pieces of equipment, which means we have a spare capacity of 63 for that cluster.
Ultimately I am trying to minimise the number of pieces of equipment, which seems to be an equivalent problem to minimising the average spare capacity, which to me is also a nice way of visualising the data.
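The arithmetic above can be written out as a small sketch. The capacity of 64 comes from the question; the function names are made up for illustration. Since the total spare capacity equals 64 times the number of pieces of equipment minus the (fixed) number of points, fewer pieces of equipment always means less total spare capacity.

```python
import math

CAPACITY = 64  # points served by one piece of equipment (from the question)

def equipment_needed(cluster_size, capacity=CAPACITY):
    """Pieces of equipment required to serve a cluster of the given size."""
    return math.ceil(cluster_size / capacity)

def spare_capacity(cluster_size, capacity=CAPACITY):
    """Unused slots left over once every point in the cluster is served."""
    return equipment_needed(cluster_size, capacity) * capacity - cluster_size

def summarise(cluster_sizes, capacity=CAPACITY):
    """Return (total equipment, average spare capacity per cluster)."""
    total_equipment = sum(equipment_needed(n, capacity) for n in cluster_sizes)
    total_spare = sum(spare_capacity(n, capacity) for n in cluster_sizes)
    return total_equipment, total_spare / len(cluster_sizes)

print(equipment_needed(64), spare_capacity(64))  # 1 0
print(equipment_needed(65), spare_capacity(65))  # 2 63
```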
I have tried multiple approaches:
- K-means
  - Most should know how this works
  - Average spare capacity of 32
  - Runs in O(n^2)
- Sorted list of ab distances
  - I tried an alternative approach like so:
    - Initialise cluster points by randomly selecting points from the data
    - Determine the distance matrix between every point and every cluster
    - Flatten it into a list
    - Sort the list
    - Go from smallest to longest distance assigning points to clusters
    - Assign clusters points until they reach 64, then no more can be assigned
    - Stop iterating through the list once all points have been assigned
    - Update the cluster centroid based on the assigned points
    - Repeat steps 1-7 until the cluster locations converge (as in K-means)
    - Collect cluster locations that are nearby into one cluster
  - This had an average spare capacity of approximately 0, by design
  - This worked well for my test data set, but as soon as I expanded to the full set (30 million points) it took far too long, probably because we have to sort the full list, O(N log N), and then iterate over it until all points have been assigned, O(NK), and then repeat that until convergence
- Linear Programming
  - This was quite simple to implement using libraries, but also took far too long again because of the complexity
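For reference, the sorted-distance approach from the list above can be sketched roughly as follows. This is one reading of the described steps, not the asker's actual code: all function names are hypothetical, points are assumed to be (x, y) tuples, the final "collect nearby cluster locations" step is left out, and the sketch assumes the total capacity (number of clusters times 64) is at least the number of points.

```python
import math
import random

def assign_with_capacity(points, centroids, capacity=64):
    """Greedily assign points to centroids, closest pairs first,
    never giving a centroid more than `capacity` points."""
    # Flatten the point-to-centroid distance matrix into one list and sort it.
    pairs = sorted(
        (math.dist(p, c), i, k)
        for i, p in enumerate(points)
        for k, c in enumerate(centroids)
    )
    assignment = {}                  # point index -> centroid index
    load = [0] * len(centroids)      # points currently held by each centroid
    for _, i, k in pairs:
        if i in assignment or load[k] >= capacity:
            continue                 # point already placed, or centroid is full
        assignment[i] = k
        load[k] += 1
        if len(assignment) == len(points):
            break                    # every point has been placed
    return assignment

def update_centroids(points, centroids, assignment):
    """Move each centroid to the mean of the points assigned to it."""
    updated = []
    for k, old in enumerate(centroids):
        members = [points[i] for i, kk in assignment.items() if kk == k]
        if members:
            updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
        else:
            updated.append(old)      # keep an empty cluster where it was
    return updated

def capacitated_kmeans(points, n_clusters, capacity=64, max_iter=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)   # random initial cluster points
    assignment = {}
    for _ in range(max_iter):
        assignment = assign_with_capacity(points, centroids, capacity)
        updated = update_centroids(points, centroids, assignment)
        if updated == centroids:     # cluster locations have converged
            break
        centroids = updated
    return centroids, assignment

# Small usage example with made-up data.
rng = random.Random(1)
demo_points = [(rng.random(), rng.random()) for _ in range(10)]
demo_centroids, demo_assignment = capacitated_kmeans(demo_points, n_clusters=3, capacity=4)
print(len(demo_assignment))  # 10
```

The sort over all N x K point-centroid pairs is the expensive part, which matches the scaling problem described in the question.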
I am open to any suggestions on possible algorithms/languages best suited to do this. I have experience with machine learning, but couldn't think of an obvious way of doing this using that.
Let me know if I missed any information out.