Find p-value of model in Python
Below is my code. I'm looking to find the p-value for the model that was created. Kindly help me with this!
from sklearn import linear_model
combined_data = combined_data.fillna(0)
predictors_1 = combined_data[['total_cases','new_cases','new_deaths','Transit Mobility 7Day Avg']]
response_1 = combined_data['Reproduction 7Day Avg']
regression_1 = linear_model.LinearRegression()
model_1 = regression_1.fit(predictors_1,response_1)
print('Coefficients: ',model_1.coef_,'Intercept: ',model_1.intercept_,'\nRSquared: ',model_1.score(predictors_1,response_1))
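scikit-learn's LinearRegression does not report p-values, so one option is to compute them from the least-squares fit directly. The following is a minimal sketch using NumPy and SciPy, assuming an ordinary least-squares model with an intercept. The helper name ols_p_values and the synthetic data are hypothetical and only stand in for predictors_1 and response_1.

```python
import numpy as np
from scipy import stats


def ols_p_values(X, y):
    """Return (coefficient p-values with the intercept first, overall F-test p-value)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape

    # Design matrix with an explicit intercept column, matching fit_intercept=True.
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)

    residuals = y - design @ beta
    dof_resid = n - k - 1
    rss = residuals @ residuals

    # Standard errors from the estimated covariance of the coefficients.
    cov = (rss / dof_resid) * np.linalg.pinv(design.T @ design)
    t_stats = beta / np.sqrt(np.diag(cov))
    coef_p = 2 * stats.t.sf(np.abs(t_stats), dof_resid)

    # Overall F-test: does the model explain more than an intercept-only model?
    tss = np.sum((y - y.mean()) ** 2)
    f_stat = ((tss - rss) / k) / (rss / dof_resid)
    model_p = stats.f.sf(f_stat, k, dof_resid)
    return coef_p, model_p


# Synthetic stand-in data (assumption): only the first predictor matters.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = 1.5 + 2.0 * X_demo[:, 0] + rng.normal(size=200)

coef_p, model_p = ols_p_values(X_demo, y_demo)
print("Coefficient p-values (intercept first):", coef_p)
print("Model p-value (F-test):", model_p)
```

With the data from the question this would be called as ols_p_values(predictors_1, response_1). If statsmodels is available, a fitted OLS result exposes the same quantities through its pvalues and f_pvalue attributes.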
See also questions close to this topic

Problem with calculation of days from date in DataFrame in Python Pandas
I have a DataFrame like below:
df = pd.DataFrame({"data" : ["02.01.2020"]})
df["data"] = pd.to_datetime(df["data"], dayfirst=True)
And list of special dates:
special_date = pd.to_datetime(["04.01.2020", "01.01.2020"], dayfirst=True)
And I need to calculate 2 columns in this DataFrame:
col1 = number of days to the next special date
col2 = number of days from the last special date
So I need a result like below:
col1 = 2 because the next special date after 02.01.2020 is in 2 days (04.01.2020)
col2 = 1 because the last special date before 02.01.2020 was 1 day ago (01.01.2020)
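One possible way to get both columns is to sort the special dates and look up each row's neighbours with numpy.searchsorted. This is only a sketch under a few assumptions: the dates are day-first, the list of special dates is not empty, and a row that falls exactly on a special date counts as 0 days for both columns. The helper name days_to_special_dates is hypothetical.

```python
import numpy as np
import pandas as pd


def days_to_special_dates(dates: pd.Series, special: pd.DatetimeIndex) -> pd.DataFrame:
    """Days until the next special date (col1) and since the last one (col2)."""
    special_sorted = special.sort_values().to_numpy()
    d = dates.to_numpy()
    n = len(special_sorted)

    # Position of the first special date >= d, and of the last special date <= d.
    idx_next = np.searchsorted(special_sorted, d, side="left")
    idx_prev = np.searchsorted(special_sorted, d, side="right") - 1
    has_next = idx_next < n
    has_prev = idx_prev >= 0

    # Clip so the lookup is always valid; rows without a neighbour become NaN below.
    next_vals = special_sorted[np.clip(idx_next, 0, n - 1)]
    prev_vals = special_sorted[np.clip(idx_prev, 0, n - 1)]

    one_day = np.timedelta64(1, "D")
    col1 = pd.Series((next_vals - d) / one_day, index=dates.index).where(has_next)
    col2 = pd.Series((d - prev_vals) / one_day, index=dates.index).where(has_prev)
    return pd.DataFrame({"col1": col1, "col2": col2})


df = pd.DataFrame({"data": ["02.01.2020"]})
df["data"] = pd.to_datetime(df["data"], dayfirst=True)
special_date = pd.to_datetime(["04.01.2020", "01.01.2020"], dayfirst=True)

df = df.join(days_to_special_dates(df["data"], special_date))
print(df)
```

Rows with no later (or no earlier) special date get NaN in the corresponding column, which is why both columns come back as floats.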
Use python seaborn to set Heatmap correlations ONLY between certain values
I've got some data for a plastic extruder machine that I am looking for patterns in. I was able to use this answer to get part of the way to a solution by showing correlations over a certain threshold using a seaborn heatmap. Due to the way the machine operates, many of the values I need to analyze are negatively correlated. For example, if you increase the speed the extruder operates at, you decrease the weight of the product made, and defining this is of interest to the operators.
What I have so far is from the answer in the link above and works fine for events over the threshold set at kot.
corr = df2.corr()
kot = corr[corr>=.8]
plt.figure(figsize=(60,40))
sns.heatmap(kot, cmap="Greens")
Can someone help me define it so I could also show correlations that are less than -0.8? It would be really helpful if the same display could also show correlations above +0.8. How would I set kot to do that?
Many thanks.
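One way to keep both strong positive and strong negative correlations is to filter on the absolute value and draw the result with a diverging colormap. The sketch below assumes a threshold of 0.8 in both directions; the function names and the demo data are hypothetical, and the plotting helper imports seaborn and matplotlib only when it is called.

```python
import numpy as np
import pandas as pd


def strong_correlations(df: pd.DataFrame, threshold: float = 0.8) -> pd.DataFrame:
    """Correlation matrix with every entry where |r| < threshold replaced by NaN."""
    corr = df.corr()
    return corr[corr.abs() >= threshold]


def plot_strong_correlations(df: pd.DataFrame, threshold: float = 0.8) -> None:
    """Heatmap of the filtered matrix; NaN cells are left blank by seaborn."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    kot = strong_correlations(df, threshold)
    plt.figure(figsize=(10, 8))
    # A diverging colormap on a symmetric scale separates negative from positive values.
    sns.heatmap(kot, cmap="RdYlGn", vmin=-1, vmax=1)
    plt.show()


# Demo data (assumption): weight falls as speed rises, temperature is unrelated.
rng = np.random.default_rng(0)
speed = rng.normal(size=200)
df_demo = pd.DataFrame({
    "speed": speed,
    "weight": -speed + 0.1 * rng.normal(size=200),
    "temperature": rng.normal(size=200),
})

kot = strong_correlations(df_demo)
print(kot)
```

With the DataFrame from the question, plot_strong_correlations(df2) would draw the heatmap. The diagonal always passes the filter because each column correlates perfectly with itself.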

Replace data in json file with a dataframe
I have a json file in the following format
"rawDataBody": {
  "data": [
    { "name": "Test", "unit": "", "format": "integer", "key": "Test" }
  ],
  "dataBlock": [
    [7, 1730569828, 3490, 1608636960, 30.62, 1003.82, 44.14, 683806.38, 2, 1, 0, 0],
    [0, 1730563432, 3545, 1608636960, 29.89, 1003.52, 39.25, 557582.38, 2, 1, 0, 0],
    [1, 1730579048, 3571, 1608636960, 29.79, 1003.45, 41.07, 494566.53, 2, 1, 0, 0],
    [2, 1730568292, 3595, 1608636960, 29.62, 1003.40, 42.72, 546424.75, 2, 1, 0, 0],
I want to replace the data in dataBlock with my dataframe and write the whole thing out to a JSON file
5.0,1730566755.0,22.11,1608636969.0,33.42,1003.5,39.78,60591.71,9.0,1.0,0.0,0.0
6.0,1730551139.0,22.14,1608636969.0,33.27,1003.64,38.77,77906.27,9.0,1.0,0.0,0.0
6.0,1730551139.0,22.14,1608636969.0,33.27,1003.64,38.77,77906.27,9.0,1.0,0.0,0.0
5.0,1730566755.0,42.11,1608636969.0,33.42,1003.5,39.78,60591.71,9.0,1.0,0.0,0.0
The code I have tried is below. I don't know why it is not working. Can someone help me?
with open('data.txt', 'r') as file:
    jsonData = json.load(file)
dfFinalResult = dfFinalResult.values.tolist()
for item in jsonData['rawDataBody']['dataBlock']:
    item = dfFinalResult
with open('newjsonfile.txt', 'w') as file:
    json.dump(jsonData, file)
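In the loop above, item = dfFinalResult only rebinds the loop variable, so the list stored in jsonData is never changed. A minimal sketch of one fix is to assign the new rows to the dataBlock key directly. The in-memory dictionary and the shortened rows below are stand-ins (assumptions) for the real file and dataframe.

```python
import json

import pandas as pd

# Stand-in for the parsed contents of data.txt, with shortened rows.
json_data = {
    "rawDataBody": {
        "data": [{"name": "Test", "unit": "", "format": "integer", "key": "Test"}],
        "dataBlock": [[7, 1730569828, 3490], [0, 1730563432, 3545]],
    }
}

# Stand-in for dfFinalResult.
df_final_result = pd.DataFrame([[5.0, 1730566755.0, 22.11], [6.0, 1730551139.0, 22.14]])

# Replace the whole list in one assignment instead of rebinding a loop variable.
json_data["rawDataBody"]["dataBlock"] = df_final_result.values.tolist()

output = json.dumps(json_data)
print(output)
```

To write to disk as in the question, json.dump(json_data, file) inside the with open('newjsonfile.txt', 'w') block would replace the json.dumps call.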

R lm() working for single row, but not working with for loop
I am trying to run lm() using a for loop on a matrix of gene expression values. The dataset is divided among humans and chimps, and I am comparing their relative expression. The original data set can be downloaded using this link. The following are the first 6 rows, which I am using in this post:
> matrix
                Human_AF_8 Human_EU_11 Human_EU_4 Chimpanzee_4 Chimpanzee_6 Chimpanzee_5
ENSG00000000003  0.1394345  0.27961627  0.6147440    0.1857581   0.19963078    0.4290812
ENSG00000000005  0.8167632  0.81676316  0.5223724    2.6947268   0.59724108    0.7190366
ENSG00000000419  0.4391277  2.83122842  0.2066077    0.7903616   0.26222373    0.5113423
ENSG00000000457  1.4025076  0.07813095  0.6768202    1.9199726   0.18687230    1.4537927
ENSG00000000460  0.8636231  0.02775471  1.0507558    0.9997930   0.01413707    0.2064266
ENSG00000000938  1.7407105  0.51450595  0.8887369    1.1291976   0.29129441    0.4344628
After reading the data frame and converting it to a matrix (the first column becomes the row names), I am trying to fit lm() and save its statistical summary to calculate the p-value.
Testing one row works fine
# Making reference
species <- c(rep("human",3), rep("chimp",3))
# Testing for one row works fine
species.lm.sum <- summary(lm(matrix[1, ] ~ species))
# P value
pval <- pf(species.lm.sum$fstatistic[1], species.lm.sum$fstatistic[2], species.lm.sum$fstatistic[3], lower.tail = FALSE)
# Printing
pval
Testing multiple rows gives an error
p.values <- c()
for (i in 1:nrow(matrix)) {
  stat <- summary(lm(matrix[i, ] ~ species))
  pval <- pf(stat$fstatistic[1], stat$fstatistic[2], stat$fstatistic[3], lower.tail = FALSE)
  p.values <- append(p.values, pval)
}
p.values
Error: Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : contrasts can be applied only to factors with 2 or more levels
Traceback:
6. stop("contrasts can be applied only to factors with 2 or more levels")
5. `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]])
4. model.matrix.default(mt, mf, contrasts)
3. model.matrix(mt, mf, contrasts)
2. lm(exp.meas.mat[i, ] ~ species)
1. summary(lm(exp.meas.mat[i, ] ~ species))
P.S. if trying with the original dataset, change the reference as follows,
species <- c(rep("human", 14), rep("chimp", 6))
R markdown chunk
```{r, "LM"}
p.values <- c()
for (i in 1:nrow(data)) {
  stat <- summary(lm(data[i, ] ~ species))
  pval <- pf(stat$fstatistic[1], stat$fstatistic[2], stat$fstatistic[3], lower.tail=FALSE)
  p.values <- append(p.values, pval)
}
head(p.values)
```
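For readers working in Python, the same per-row test can be sketched with SciPy. With a single two-level factor, the F-test reported by lm() is the one-way ANOVA F-test, which scipy.stats.f_oneway computes. The array below copies the six rows shown in the post, and the variable names are hypothetical. This is only an equivalent calculation and does not diagnose the R error.

```python
import numpy as np
from scipy.stats import f_oneway

# The six rows from the post: three human columns followed by three chimpanzee columns.
expression = np.array([
    [0.1394345, 0.27961627, 0.6147440, 0.1857581, 0.19963078, 0.4290812],
    [0.8167632, 0.81676316, 0.5223724, 2.6947268, 0.59724108, 0.7190366],
    [0.4391277, 2.83122842, 0.2066077, 0.7903616, 0.26222373, 0.5113423],
    [1.4025076, 0.07813095, 0.6768202, 1.9199726, 0.18687230, 1.4537927],
    [0.8636231, 0.02775471, 1.0507558, 0.9997930, 0.01413707, 0.2064266],
    [1.7407105, 0.51450595, 0.8887369, 1.1291976, 0.29129441, 0.4344628],
])
is_human = np.array([True, True, True, False, False, False])

# One F-test per gene (row), comparing the human and chimpanzee groups.
p_values = np.array([
    f_oneway(row[is_human], row[~is_human]).pvalue for row in expression
])
print(p_values)
```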

Pearson's p value approach to test model's significance (pValueCompute.exe)
I have two questions regarding the calculation of the p value and success rate using the method described in Pearson et al., 2007.
Does this testing method work only when cross-validation is used in the maxent settings? Can it work if I use bootstrap?
If yes, then I noticed that in all of the examples I came across, the values of the first column (obtained from the minimum training presence test omission) are always 0 or 1, while in my file some have decimal values. Since the 0 is converted to 1, could I say that the 0.2 in my file will be converted to 0.8?
Also, I'm using the MTP (minimum training presence) threshold. It was not clear to me whether I should apply the MTP threshold in the maxent settings before running the model or leave it with no threshold.

In multilabel classification, how do we find the P values and confidence interval for the performance evaluation metrics?
I am doing multilabel classification using two different classifiers with a dataset of 7 labels and 20 features. I have computed the accuracy, sensitivity, specificity, and area under the curve (AUC) metrics. Now, I want to report the P values and confidence intervals for AUC. I found this article, which is applicable to a binary classification problem. But how can we implement this in a multilabel setting?
Kindly provide some suggestions on this.
Thank you in advance!
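One common option, which is not necessarily the method of the article mentioned above, is to treat each label as its own binary problem and estimate a percentile bootstrap confidence interval for its AUC. The sketch below covers confidence intervals only, not p-values. It assumes one score column per label; the function names, the number of resamples, and the synthetic data are all hypothetical.

```python
import numpy as np
from scipy.stats import rankdata


def auc_score(y_true, y_score):
    """AUC from the rank-sum (Mann-Whitney) formula; ties get average ranks."""
    y_true = np.asarray(y_true).astype(bool)
    n_pos = y_true.sum()
    n_neg = y_true.size - n_pos
    ranks = rankdata(y_score)
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(y_true, y_score, n_boot=500, alpha=0.05, seed=0):
    """Return (AUC, lower, upper) using a percentile bootstrap over samples."""
    rng = np.random.default_rng(seed)
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    n = y_true.size
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        if y_true[idx].min() == y_true[idx].max():
            continue  # resample contains one class only, so AUC is undefined
        aucs.append(auc_score(y_true[idx], y_score[idx]))
    lower, upper = np.quantile(aucs, [alpha / 2, 1 - alpha / 2])
    return auc_score(y_true, y_score), lower, upper


# Synthetic stand-in for 7 labels: scores are shifted upward for positive samples.
rng = np.random.default_rng(1)
Y = rng.integers(0, 2, size=(300, 7))
scores = Y + rng.normal(scale=1.0, size=Y.shape)

results = [bootstrap_auc_ci(Y[:, j], scores[:, j]) for j in range(Y.shape[1])]
for j, (auc, lower, upper) in enumerate(results):
    print(f"label {j}: AUC={auc:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

To compare two classifiers, the same resampled indices could be applied to both sets of scores and the difference in AUC collected per resample, which is a further assumption beyond what the question states.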