How can I calculate the mean of every 6 rows in a data frame based on a condition in another column?
I have a data frame in R containing 3000 rows and 2 columns (temp, flag). I am trying to calculate "meantemp" as a third column for every 6 rows if the corresponding values in the flag column are not NA. If the flag values are NA, I want the meantemp column to show NA. Sorry for this question.
1 answer

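You can try:

    library(dplyr)

    df %>%
      # number the rows 1, 2, 3, ... and put every 6 consecutive rows in one group
      group_by(group = ceiling(row_number() / 6)) %>%
      # mean of temp for the block only when no flag value in the block is NA
      mutate(meantemp = if (all(!is.na(flag))) mean(temp, na.rm = TRUE) else NA_real_) %>%
      ungroup() %>%
      select(-group) -> df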
See also questions close to this topic

How to fill a shapefile with holes
Currently, I am processing spatial data in R. https://rstudiopubsstatic.s3.amazonaws.com/259089_2f5213f21003443994b28aab0a54cfd6.html
I created a time trade area using the method described on the page above. However, the shapefile I created has holes in it and is incomplete. How can I fill these holes?
https://drive.google.com/file/d/10wJxptWxs59MB9KZVTsrs0biFYN9pOa/view?usp=sharing
Thank you for your cooperation.

R Distill: Custom Listing by Category
I'd like to create a custom listing page. The help docs include the advice to create a separate file with the following YAML header:
    ---
    title: "Gallery of featured posts"
    listing:
      posts:
        - 2016-11-08-sharpe-ratio
        - 2017-11-09-visualizing-asset-returns
        - 2017-09-13-asset-volatility
    ---
But I'd like to subset posts by category, rather than by listing specific posts. Is there a way to do this?
Thank you!

Searching for the Right AR(2) Seed Like I Did for AR(1) in R
Using the arima.sim() function to simulate time series data that follows a particular ARIMA model requires a lot of trials (especially for a ridiculously small sample size) of changing the set.seed value, like the example below:

    library(forecast)

    set.seed(1)
    ar1 <- arima.sim(n = 15, model = list(ar = 0.2, order = c(1, 0, 0)), sd = 1)
    (ar2 <- auto.arima(ar1, ic = "aicc"))

    set.seed(3)
    ar1 <- arima.sim(n = 15, model = list(ar = 0.2, order = c(1, 0, 0)), sd = 1)
    (ar2 <- auto.arima(ar1, ic = "aicc"))

    set.seed(3)
    ar1 <- arima.sim(n = 15, model = list(ar = 0.2, order = c(1, 0, 0)), sd = 1)
    (ar2 <- auto.arima(ar1, ic = "aicc"))
until I get my desired result. I have found R code that prints out the right seed to set for AR(1):

    library(future.apply)

    FUN <- function(i) {
      set.seed(i)
      ar1 <- arima.sim(n = 15, model = list(ar = 0.6, order = c(1, 0, 0)), sd = 1)
      ar2 <- auto.arima(ar1, ic = "aicc")
      (cf <- ar2$coef)
      if (length(cf) == 0) {
        rep(NA, 2)
      } else if (all(grepl(c("ar1|intercept"), names(cf))) &
                 substr(cf["ar1"], 1, 6) %in% "0.6000") {
        c(cf, seed = i)
      } else {
        rep(NA, 2)
      }
    }

    R <- 1e4
    seedv <- 1:R

    library(parallel)
    cl <- makeCluster(detectCores() - 1 + 1)
    clusterExport(cl, c("FUN"), envir = environment())
    clusterEvalQ(cl, suppressPackageStartupMessages(library(forecast)))

    res <- parLapply(cl, seedv, "FUN")
    (res1 <- res[!sapply(res, anyNA)])

    stopCluster(cl)

    res2 <- Reduce(function(...) merge(..., all = T),
                   lapply(res1, function(x) as.data.frame(t(x))))
    res2[order(res2$seed), ]
which produces this result:
    #         ar1 seed
    # 1 0.8000417 4195

which, when I check it as below:

    ar1 <- arima.sim(n = 15, model = list(ar = 0.6, order = c(1, 0, 0)), sd = 1)
    (ar2 <- forecast::auto.arima(ar1, ic = "aicc"))

holds true:

    Series: ar1
    ARIMA(1,0,0) with zero mean

    Coefficients:
             ar1
          0.6000
    s.e.  0.1923

    sigma^2 estimated as 2.051:  log likelihood=-26.38
    AIC=56.75   AICc=57.75   BIC=58.17
What I want

I want a case of AR(2). Below is my trial:

    FUN <- function(i) {
      set.seed(i)
      ar1 <- arima.sim(n = 15, model = list(ar = c(0.9, -0.9), order = c(2, 0, 0)), sd = 1)
      ar2 <- auto.arima(ar1, ic = "aicc")
      (cf <- ar2$coef)
      if (length(cf) == 0) {
        rep(NA, 2)
      } else if (all(grepl(c("ar1|ar2|intercept"), names(cf))) &
                 substr(cf["ar1"], 1, 5) %in% "0.90" &
                 substr(cf["ar2"], 1, 5) %in% "-0.90") {
        c(cf, seed = i)
      } else {
        rep(NA, 2)
      }
    }

    R <- 2e5
    seedv <- 1e5:R

    library(parallel)
    cl <- makeCluster(detectCores() - 1 + 1)
    clusterExport(cl, c("FUN"), envir = environment())
    clusterEvalQ(cl, suppressPackageStartupMessages(library(forecast)))

    res <- parLapply(cl, seedv, "FUN")
    (res1 <- res[!sapply(res, anyNA)])

    stopCluster(cl)

    res2 <- Reduce(function(...) merge(..., all = T),
                   lapply(res1, function(x) as.data.frame(t(x))))
    res2[order(res2$seed), ]

The search stops here. I confirm my result as below:

    set.seed(1509)
    ar1 <- arima.sim(n = 20, model = list(ar = c(0.9, -0.9), order = c(2, 0, 0)), sd = 1)
    library(forecast)
    (ar2 <- auto.arima(ar1, ic = "aicc"))

My output coefficients for AR(2) are not exactly 0.90, -0.90:

    Series: ar1
    ARIMA(2,0,0) with zero mean

    Coefficients:
             ar1      ar2
          0.9369  -0.8997
    s.e.  0.0940   0.0752

    sigma^2 estimated as 0.9048:  log likelihood=-28.12
    AIC=62.24   AICc=63.74   BIC=65.23

How to convert pandas.core.series.Series object with Multiple index to a pandas Dataframe?
I have a pandas Series with a multi-level index ("target", "Lastnewjob", "experienceGroup"). It is of type pandas.core.series.Series. I want to convert it to a DataFrame where the "experienceGroup" values become the column names and "target" and "Lastnewjob" remain as columns.

Code to get the Series using groupby:

    Job = df.groupby(['target', 'last_new_job'])['experienceGroup'].value_counts()
    Job.unstack()

I have tried this:

    data = pd.Series.to_frame(Job)

but the result had no columns for the experience groups.

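One possible way to get that shape, sketched here in Python with made-up sample data (only the column names `target`, `last_new_job` and `experienceGroup` come from the question; the values and the names `job` and `wide` are assumptions for illustration), is to unstack the innermost index level and then reset the remaining index levels into ordinary columns:

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the question.
df = pd.DataFrame({
    "target": [0, 0, 0, 1, 1, 1],
    "last_new_job": ["1", "1", "2", "1", "2", "2"],
    "experienceGroup": ["junior", "senior", "junior", "junior", "senior", "senior"],
})

# Series indexed by (target, last_new_job, experienceGroup), as in the question.
job = df.groupby(["target", "last_new_job"])["experienceGroup"].value_counts()

# unstack() moves the innermost level (experienceGroup) into the columns;
# reset_index() turns target and last_new_job back into regular columns.
wide = job.unstack(fill_value=0).reset_index()
wide.columns.name = None  # drop the leftover "experienceGroup" label on the column axis

print(wide)
```

`fill_value=0` is a choice made for this sketch: combinations that never occur show a count of 0 instead of NaN.
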
How to apply autoregression in python
I am trying to fit an autoregressive model to more than one data history and obtain a single model that represents all of the different data histories. The typical examples I have seen online apply to just one data history. At a minimum, my research requires fitting an autoregression to at least 3 data histories.
I really would appreciate your help.
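One approach that may fit this situation is pooled least squares: build the lagged regression rows for each history separately (so that no lag crosses from one history into another), stack them, and solve for a single set of coefficients. The sketch below assumes all histories share the same AR(p) coefficients and intercept; the function names `lagged_design`, `fit_pooled_ar` and `simulate` are hypothetical and the data are synthetic.

```python
import numpy as np

def lagged_design(series, p):
    """Return (X, y) for one series, where row t of X holds y[t-1], ..., y[t-p]."""
    series = np.asarray(series, dtype=float)
    y = series[p:]
    X = np.column_stack([series[p - k: len(series) - k] for k in range(1, p + 1)])
    return X, y

def fit_pooled_ar(histories, p):
    """Fit one AR(p) model (intercept + p lag coefficients) to several histories by pooled OLS."""
    blocks = [lagged_design(s, p) for s in histories]
    X = np.vstack([b[0] for b in blocks])
    y = np.concatenate([b[1] for b in blocks])
    X = np.column_stack([np.ones(len(y)), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, phi_1, ..., phi_p]

def simulate(start, n):
    """Noise-free AR(1) history x[t] = 1 + 0.5 * x[t-1], used only to check the fit."""
    xs = [start]
    for _ in range(n - 1):
        xs.append(1.0 + 0.5 * xs[-1])
    return xs

# Three synthetic histories that follow the same process from different starting points.
histories = [simulate(0.0, 20), simulate(5.0, 20), simulate(-3.0, 20)]
coef = fit_pooled_ar(histories, p=1)
print(coef)
```

Because the synthetic histories contain no noise, the fit should recover the intercept and lag coefficient used in `simulate`; with real data the estimates would only approximate the underlying values.
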

How can I use split() in a string when broadcasting a dataframe's column?
Take the following dataframe:
df = pd.DataFrame({'col_1':[0, 1], 'col_2':['here 123', 'here 456']})
Result:
       col_1     col_2
    0      0  here 123
    1      1  here 456
I need to create a 3rd column (broadcasting), using a condition on col_1 and splitting the string in col_2. This is OK to do:

    df['col_3'] = float('NaN')
    df.loc[df['col_1'] == 1, ['col_3']] = df['col_2'].str.slice(5, 8)
Result:
       col_1     col_2 col_3
    0      0  here 123   NaN
    1      1  here 456   456
But I need to specify dynamic indexes to split the string in col_2, instead of (5, 8). When I try to run the following code it does not work, because df['col_2'] is treated as a Series:

    df.loc[df['col_1'] == 1, ['col_3']] = df['col_2'].split(' ')[0]

I'm spending a huge amount of time trying to solve this without iterating over the dataframe.
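A minimal sketch of one vectorized option, assuming the goal is the token after the space (index 1, which matches the `456` shown in the earlier result): the `.str` accessor exposes `split` for a whole Series, and a second `.str[...]` picks one element from each resulting list. `Series.where` then keeps the value only on rows that meet the condition.

```python
import pandas as pd

df = pd.DataFrame({'col_1': [0, 1], 'col_2': ['here 123', 'here 456']})

# Split every string on the space and take the second piece of each list.
second_token = df['col_2'].str.split(' ').str[1]

# Keep the token only where col_1 == 1; other rows become missing values.
df['col_3'] = second_token.where(df['col_1'] == 1)

print(df)
```

Replacing `.str[1]` with `.str[0]` or `.str[-1]` selects a different piece without any explicit loop over the rows.
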

Grouping documents by week/month/year using Node js
I am currently developing a cash register app. I am using a model-controller-view-repository design. I have a cart model that holds products, and when a sale is validated, it clones the cart and adds a date. My question is: how can I get all the sales made in a specific period of time in a given year? Example of what I want: «http://localhost:PORT/sales/month/3» would get all sales made in March, « http://localhost:PORT/sales/month/3/week/1 » would get all sales made in the first week of March, and so on. I have tried several things but I am still lost in the MEAN stack. Thanks, I hope my post is not too confusing.

Question about R: how can I take the mean of each category?
I am trying to write code to count how many times each category is named and then take the mean of each category.
How would I go about this? A for loop? An if/else? A function? I want to write code that counts every time "location" appears. It stores the count and then tells me there are 4 within this category. I also want to take the mean, so 4 divided by 25 = 0.16.
"location" "Masculinity" "ownership" "Masculinity" "difference" "agency" "agency" "Feminality" "ownership" "Feminality" "ownership" "location" "agency" "Masculinity" "difference" "location" "Feminality" "ownership" "agency" "Masculinity" "difference" "difference" "Feminality" "location" "Masculinity"
Thank you.

How to calculate the mean value for each row name and each year in a dataframe in R Studio
My data frame looks like this: the first column is a kind of loop (high, low, mid), the second column holds years from 1981 to 2020, and the third column holds values. As you can see, for each year I have three values for "high", three values for "low" and three values for "mid". I want to calculate the mean of those three values so that I have only one value for "high", one value for "low" and one value for "mid", and do the same for each year.

    Suelo <- c("low", "mid", "high", "low", "mid", "high", "low", "mid", "high",
               "low", "mid", "high", "low", "mid", "high", "low", "mid", "high")
    Years <- c("1981", "1981", "1981", "1981", "1981", "1981", "1981", "1981", "1981",
               "1982", "1982", "1982", "1982", "1982", "1982", "1982", "1982", "1982")
    Value <- c("453", "4543", "3459", "3434", "34534", "333", "453", "223", "377",
               "976", "34534", "33456", "453", "54643", "45659", "33454", "34924", "23213")
    DF <- data.frame(Suelo, Years, Value)
What I want to get is a data frame that looks like this (the values are invented, not calculated):

    Suelo <- c("low", "mid", "high", "low", "mid", "high")
    Years <- c("1981", "1981", "1981", "1982", "1982", "1982")
    Value <- c("453", "4543", "3459", "3434", "34534", "333")
    DF <- data.frame(Suelo, Years, Value)