BEKK model simulation in R
I have been working with a BEKK(1,1) model of dimensions 3, 4, and 5 for a time series analysis. I was given feedback to include a simulation study. In order to trust the results I obtain, I want to show via simulations that the estimation of the BEKK model parameters also works well for the sample sizes considered in the paper, i.e. that the distributional theory can be applied for my sample size.
In other words, I want to investigate whether the sample size is large enough for the asymptotic results to apply.
Method:
I wish to generate data from the fitted model in the dimension-3 case, with sample size 3000, estimate the parameters by fitting the BEKK model to the generated data set, and repeat this step, say, 10000 times. I then obtain 10000 estimates of each parameter, from which the sampling distribution can be constructed and compared with the asymptotic distribution.
Then I repeat this procedure for dimensions 4 and 5.
#I've been using the mgarchBEKK package when creating my BEKK models.
#The package provides the example below as help for simulation:
## Simulate series:
## Simulate series:
simulated <- simulateBEKK(2, 1000, c(1, 1))
## Prepare the matrix:
simulated <- do.call(cbind, simulated$eps)
## Estimate with default arguments:
estimated <- BEKK(simulated)
I'm not a master in R by any means, so I'm not quite sure how to code the procedure I describe above.
Any help is greatly appreciated :)
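The replication loop described above can be sketched as follows with mgarchBEKK. This is only a sketch under assumptions: I'm assuming `simulateBEKK()` accepts the fitted coefficients through its `params` argument and that `BEKK()` returns the estimates in a component named `est.params` (check `?simulateBEKK` and `?BEKK` for the exact signatures), and `my.fitted.params` is a placeholder for the coefficient vector of the dimension-3 fit:

```r
library(mgarchBEKK)

## one Monte Carlo replication: simulate a series from the fitted
## parameters, re-estimate, and return the estimated coefficients
one_rep <- function(n.obs, dim, true.params) {
  sim <- simulateBEKK(dim, n.obs, c(1, 1), params = true.params)
  eps <- do.call(cbind, sim$eps)
  fit <- BEKK(eps, order = c(1, 1))
  unlist(fit$est.params)   # flatten the C, A, G matrices into one vector
}

## dimension 3, sample size 3000, many replications
## (start with a handful first -- each BEKK fit is slow)
n.reps <- 10000
est <- replicate(n.reps, one_rep(3000, 3, true.params = my.fitted.params))

## est has one row per parameter and one column per replication, so the
## sampling distribution of, say, the first parameter is hist(est[1, ]),
## to be compared with the asymptotic normal; repeat for dimensions 4 and 5
```
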
See also questions close to this topic

Why is .png file not looking clear after arranging using grid.arrange?
I am trying to arrange 9 .png files (which I created using the plot() function) in 2 columns, but the files do not look clear after using grid.arrange to arrange them.
I think that if the page size is increased, the images will be clearly visible. Or there might be some problem with the size of my plots when converted to .png.
Please suggest a solution.
Following is the code I am trying:
install.packages('png')
install.packages('gridExtra')
library(png)
library(grid)
library(gridExtra)
p1 <- readPNG('p1.png')
p2 <- readPNG('p2.png')
p3 <- readPNG('p3.png')
p4 <- readPNG('p4.png')
p5 <- readPNG('p5.png')
p6 <- readPNG('p6.png')
p7 <- readPNG('p7.png')
p8 <- readPNG('p8.png')
p9 <- readPNG('p9.png')
grid.arrange(rasterGrob(p1), rasterGrob(p2), rasterGrob(p3), rasterGrob(p4), rasterGrob(p5),
             rasterGrob(p6), rasterGrob(p7), rasterGrob(p8), rasterGrob(p9), ncol = 2)
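The blurriness usually comes from the resolution of the source PNGs rather than from grid.arrange itself: arranging already-rasterized images can only redistribute the pixels that are there. A sketch of one possible fix, re-saving each plot at a higher pixel density before reading it back (file names as in the question; the plot() call is a placeholder for the original plotting code):

```r
library(png)
library(grid)
library(gridExtra)

## save the source plot with more pixels and a higher DPI ('res');
## larger width/height/res give a sharper raster
png("p1.png", width = 1600, height = 1200, res = 200)
plot(1:10)   # placeholder for the original plotting code
dev.off()

## read it back and arrange; interpolate = FALSE avoids extra smoothing
p1 <- readPNG("p1.png")
grid.arrange(rasterGrob(p1, interpolate = FALSE), ncol = 2)
```
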

How to use map/reduce and read_csv to import a subset of columns from multiple files
I need to import multiple files into R. From each file I only need a subset of columns. I have working code that will import ALL 97 columns from an arbitrary number of files, but cannot figure out how to limit the import to only the 4 columns I need.
I rely on the tidyverse for many of my functions. R version: 3.6.0
I have created code that imports limited columns from a SINGLE file but cannot expand this to all files.
#this code works correctly and imports all columns from multiple files
setwd('R:\\Data\\Raw Data Files')
files <- c('686960.csv', '686962.csv', '686963.csv', '623056.csv', '608910.csv', '608911.csv',
           '609248.csv', '609246.csv', '722178.csv', '722177.csv', '756542.csv')
data <- files %>% map(read_csv) %>% reduce(rbind)

#this code works correctly and imports 4 columns from 1 file
test <- test_file %>%
  read_csv(
    col_types = cols_only(
      'Member ID'           = col_character(),
      'Patient Age'         = col_character(),
      'Plan Payment Amount' = col_character(),
      'Check Date'          = col_character()
    ),
    n_max = 10
  )

#this code produces an error
setwd('R:\\Data\\Raw Data Files')
files <- c('686960.csv', '686962.csv', '686963.csv', '623056.csv', '608910.csv', '608911.csv',
           '609248.csv', '609246.csv', '722178.csv', '722177.csv', '756542.csv')
data_test <- files %>%
  map(read_csv(
    col_types = cols_only(
      'Member ID'           = col_character(),
      'Patient Age'         = col_character(),
      'Plan Payment Amount' = col_character(),
      'Check Date'          = col_character()
    ),
    n_max = 10
  )) %>%
  reduce(rbind)

#Error in read_delimited(file, tokenizer, col_names = col_names, col_types = col_types, :
#  argument "file" is missing, with no default
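The likely cause of the error: `map(read_csv(...))` calls read_csv immediately, with no file argument, instead of mapping it over the file names. Extra arguments belong after the function name in `map()`, or inside a formula lambda. A sketch (file and column names from the question):

```r
library(tidyverse)

## the 4 columns of interest, everything else is dropped at parse time
wanted <- cols_only(
  `Member ID`           = col_character(),
  `Patient Age`         = col_character(),
  `Plan Payment Amount` = col_character(),
  `Check Date`          = col_character()
)

files <- c("686960.csv", "686962.csv")  # etc.

## map() supplies each file name as read_csv()'s first argument and
## forwards the remaining arguments unchanged
data_test <- files %>%
  map(read_csv, col_types = wanted, n_max = 10) %>%
  reduce(rbind)

## equivalent, with an explicit formula lambda:
data_test <- files %>%
  map(~ read_csv(.x, col_types = wanted, n_max = 10)) %>%
  reduce(rbind)
```
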

How to make sure the user of a shiny app is using the right package versions in R
Due to recent experience with several bugs created by updating packages, I wonder what the best approach is for the following problem:
I currently provide a stand-alone version of my Shiny app, so to speak (just the script files to run it locally), and run a long list of require() calls to load / install the needed packages. However, in the end I would like to use fixed package versions to avoid bugs created by changes in packages.
Is there a way to ensure that the user, who may have older or newer versions of packages on their computer, is using the right version of all the packages my app needs?
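One common approach (a sketch, not the only option) is to pin versions with the renv package: it records the exact version of every package the app uses in a lockfile that ships with the scripts, and restores those versions into a project-local library on the user's machine:

```r
## on the developer machine, in the app's project directory:
install.packages("renv")
renv::init()       # set up a project-local package library
renv::snapshot()   # write renv.lock with the exact versions in use

## on the user's machine, after copying the app directory:
renv::restore()    # install the versions recorded in renv.lock
```

Because the library is project-local, this does not touch whatever older or newer package versions the user has installed globally.
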

How to update the model after applying coeftest() for fixing heteroskedasticity with R?
I'm working on a cost model, which is a multiple linear regression. After I run the model and check the diagnostic plots, the "Scale-Location" plot shows a trend, which is not normal. The R² is 0.449 and the adjusted R² is 0.444. Based on the graph I suspect heteroskedasticity, and both ncvTest() and bptest() confirm the problem.
Then I used BoxCoxTrans() to fix the heteroskedasticity. However, it doesn't apply any transformation. Some answers I found online say the reason is that the largest/smallest ratio is too small for a transformation to help (my largest/smallest is 6.44, the sample skewness is 0.114, and the estimated lambda is 0.8).
Thus, I decided to pursue another approach, coeftest(). It gives me corrected standard errors, and I can see the result after running it, but I don't know how to use the result to replace my old model or to generate a new Scale-Location plot.
(I think coeftest() only returns a summary of the model; it doesn't fix (change) the original model.)
What I want is an updated model with the new standard errors after using coeftest().
I'm sorry for this complex problem; please let me know if anything in my statement is unclear.
Code for the "coeftest()"
coeftest(rvol1, vcov = vcovHC(rvol1))
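To be clear about what coeftest() does: it does not modify the fitted model, it only recomputes the inference (standard errors, t-tests) under a heteroskedasticity-consistent covariance estimate, so there is no "updated model" to re-plot. A sketch of how the pieces fit together (rvol1 is the fitted lm from the question):

```r
library(lmtest)
library(sandwich)

## the coefficients are unchanged; only the covariance matrix differs
rob_vcov <- vcovHC(rvol1, type = "HC3")  # robust covariance estimate
coeftest(rvol1, vcov = rob_vcov)         # robust standard errors / t-tests
coefci(rvol1, vcov = rob_vcov)           # robust confidence intervals

## diagnostic plots such as plot(rvol1) still describe the original fit:
## robust SEs correct the inference, they do not remove the
## heteroskedasticity visible in the Scale-Location plot
```
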

Unable to run an anova with nonnumeric factors in 3way factorial
I have data from an experiment with three factors: Collection (1 or 2), Rate (1-5), and Variety (1-3). The response I'm interested in is Height, a numeric response with four significant figures. I want to run an ANOVA to see the main effects and interaction effects for the three factors, including the three-way interaction.
I tried the code below, but received the given error. I think that this has to do with the way my factors are set up, but I'm not very strong with coding and cannot figure out what to do to resolve the issue.
lm <- aov(Suc ~ Collection*Irrigation*Variety, data = data)
Warning messages:
1: In model.response(mf, "numeric") : using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors
summary(lm)
Error in levels(x)[x] : only 0's may be mixed with negative subscripts
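The warnings suggest the response column is stored as a factor, so aov() cannot treat it as numeric. A sketch of the usual fix (column names taken from the code in the question; note that the code uses Suc, not Height, as the response):

```r
## convert the factor response to numeric via character, so the factor's
## internal level codes are not used by accident
data$Suc        <- as.numeric(as.character(data$Suc))
data$Collection <- factor(data$Collection)
data$Irrigation <- factor(data$Irrigation)
data$Variety    <- factor(data$Variety)

## '*' expands to all main effects, all two-way interactions,
## and the three-way interaction
fit <- aov(Suc ~ Collection * Irrigation * Variety, data = data)
summary(fit)
```

As a side note, naming the result `lm` shadows the built-in lm() function; a neutral name such as `fit` avoids confusion.
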

Can't simulate values extreme enough in Monte Carlo
I have a biological dataset of 576 genes of interest and their assigned functional categories. I also have the assigned functional categories for all of the genes in the genome of the species I am working with. This allows me to set up a weighted random draw in which I pick 576 genes/functional assignments from the genome and see what the distributions are for the various functional categories. I have repeated this a million times, which seems like overkill; however, for one of my categories I am still unable to simulate a value as extreme as the one in the genes of interest.
To provide context, the functional category (call it "A") represents 14% of the genome and 28% of the genes of interest. The highest value I've simulated for category A is 22.92%, and the upper 97.5% bound is at 17.19%. This gives me trouble when I calculate the p-value, since I do that empirically, i.e. (number of simulated values for A that exceed the value of A in the genes of interest) / (total number of simulations), so my p-value is just zero.
len([i for i in probs_dict["A"] if i > target_per])/total_sims
Is there a different way to calculate p-values here, or to represent the high degree of overrepresentation?
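A standard remedy is the "add-one" Monte Carlo p-value, p = (r + 1)/(n + 1), where r is the number of simulated values at least as extreme as the observed one (Davison and Hinkley's correction). It can never be exactly zero: with a million simulations and no exceedances it reports "less than about one in a million" rather than "impossible". A small R sketch (the one-liner in the question is Python; the function name here is illustrative):

```r
## r: simulated values at least as extreme as the observed statistic
## n: total number of simulations
mc_pvalue <- function(r, n) (r + 1) / (n + 1)

mc_pvalue(0, 1e6)   # smallest reportable p-value, about 1e-06
```
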

Autofilter for Time Series in Python/Keras using Conv1d
It may look like a lot of code, but most of it is comments or formatting to make it more readable.
Given:
If I define my variable of interest, "sequence", as follows:

# define input sequence
np.random.seed(988)
# make numbers 1 to 100
sequence = np.arange(0, 10, dtype=np.float16)
# shuffle the numbers
sequence = sequence[np.random.permutation(len(sequence))]
# augment the sequence with itself
sequence = np.tile(sequence, [15]).flatten().transpose()
# scale for Relu
sequence = (sequence - sequence.min()) / (sequence.max() - sequence.min())
sequence
# reshape input into [samples, timesteps, features]
n_in = len(sequence)
sequence = sequence.reshape((1, n_in, 1))
Question:
How do I use Conv1D in an autoencoder in Keras to estimate this sequence with a reasonable level of accuracy? If Conv1D is not appropriate for this problem, what would a more appropriate layer type for the encoder-decoder be?
More information:
Points about the data:
- it is a repeating sequence of 10 distinct values
- a single lag of 10 steps should perfectly predict the sequence
- a dictionary of 10 elements should give a "predict the next given this"
I have tried other layers in the encoder and decoder portions (LSTM, Dense, multi-layer Dense) to predict, and they kept hitting a "wall" at an MSE of around 0.0833, which is the variance of a uniform distribution on [0, 1]. To me, a good autoencoder on a problem this simple should be able to get at least 99.9% accuracy, so an MSE substantially below 1%.
I haven't been able to get Conv1D to work because I am messing up the inputs. There seem to be no really great examples of how to make it work, and I am new enough to this overall architecture that it isn't apparent to me.

Adding multiple vlines for different dates in timeseries data
I'm trying to plot a line chart with multiple time series, where each line shows the sales trend over time for a specific object, within the same start and end date. My dataset is already in a "melted" form and looks like this:
'data.frame': 468 obs. of 3 variables:
 $ date       : Date, format: "2019-04-11" "2019-04-12" "2019-04-13" ...
 $ Object     : chr "Object1" "Object2" "Object3" "Object 4" ...
 $ daily_sales: int 1 257 178 177 255 240 231 214 193 174 ...
I have a set of dates for which I need vertical lines, and they're stored in a Date array, imp.dates.
When I try to plot a single vline, it works fine with the following code:

ggplot(df, aes(x = date, y = daily_sales, colour = Object, group = Object)) +
  geom_line() +
  geom_vline(aes(xintercept = imp.dates[1]), linetype = 4, colour = "black")
However, when I try to do multiple vlines:

ggplot(df, aes(x = date, y = daily_sales, colour = Object, group = Object)) +
  geom_line() +
  geom_vline(aes(xintercept = imp.dates), linetype = 4, colour = "black")
I get the following error:
Error: Aesthetics must be either length 1 or the same as the data (40): xintercept
The following are SO posts that I've looked at, to no avail:
1. Multiple vlines in plot gives error, ggplot2
2. ggplot2: how to add text to multiple vertical lines (geom_vlines) on a time x-axis?
3. How to get a vertical geom_vline to an x-axis of class date?
Post 3 comes very close, but my x variable is of class Date and not int, so I can't seem to get it to work. Any help will be appreciated.
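The usual fix for the length error: pass the vector of dates to geom_vline directly rather than inside aes(). Inside aes(), the intercepts are treated as a data aesthetic and must be length 1 or match the data; outside aes(), ggplot draws one line per element of the vector. A sketch with the names from the question:

```r
library(ggplot2)

ggplot(df, aes(x = date, y = daily_sales, colour = Object, group = Object)) +
  geom_line() +
  ## xintercept outside aes(): one vertical line per date in imp.dates
  geom_vline(xintercept = imp.dates, linetype = 4, colour = "black")
```
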

implement keras ConvLSTM2D with seq frames to forcast one single value
I am trying to implement a ConvLSTM2D that uses a sequence of low-resolution video frames to predict what will happen next. The prediction is not a frame or a video; it is a single value (a kind of classification based on what happened before). Each frame is 5x3x4 (RGB channels plus one grey IR image). So my input has shape [samples, timesteps, 5, 3, 4] and the output is [samples, 1].
I followed this example and tried to adapt it to my problem, without success: https://github.com/keras-team/keras/blob/master/examples/conv_lstm.py
seq = Sequential()
seq.add(ConvLSTM2D(filters=40, kernel_size=(3, 2),
                   input_shape=(timesteps, 5, 3, 4),
                   padding='same', strides=(2, 1),
                   return_sequences=True))
seq.add(BatchNormalization())
seq.add(Conv3D(filters=40, kernel_size=(2, 2, 2), activation='sigmoid',
               padding='same', data_format='channels_last'))
seq.add(Dense(1, activation='sigmoid'))
seq.compile(loss='binary_crossentropy', optimizer='adadelta')
seq.fit(compil, yn, batch_size=32, epochs=300, validation_split=0.05)
I get this error message:

ValueError: Error when checking target: expected dense_7 to have 5 dimensions, but got array with shape (35040, 1, 1)

If I set return_sequences=False instead, I get this one:

ValueError: Input 0 is incompatible with layer conv3d_5: expected ndim=5, found ndim=4

Calculating probability using simulation
Let Xi ~ U(0, 30) be i.i.d. random variables. If one takes a sample of n = 12, we are interested in the probability that the sample mean is greater than 18, i.e. $P(\overline X_{12} > 18)$.
Without simulation, applying the CLT, the answer is 0.1150.
Here is my approach to implement this in R using simulation:
SimProb <- function(N) {
  n <- 12
  M <- matrix(runif(1, 0, 30), N, n)
  rowMeann <- rowMeans(M)
  for (i in 1:N) {
    if (rowMeann[i] > 18)
      c <- cumsum(rowMeann[i])
    c
  }
  prob <- 1 - c
  return(prob)
}
The code does not show any error but does not work properly.
When I checked, it turned out that c = 6.083532, which can't be right. For different values of N, SimProb returns only negative values, and I don't understand why. Could someone please help me?
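For reference, a minimal corrected version: runif(1, 0, 30) generates a single number that fills the whole matrix, so the simulation needs N*n draws, and the probability is simply the fraction of row means above 18 (no cumsum involved):

```r
SimProb <- function(N) {
  n <- 12
  M <- matrix(runif(N * n, 0, 30), N, n)  # N samples of size n = 12
  mean(rowMeans(M) > 18)                  # proportion of sample means > 18
}

set.seed(1)
SimProb(1e5)   # close to the CLT answer of about 0.115
```
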

Trying to write a function that simulates a single tstatistic from two samples
I am trying to write my own function that simulates a t-statistic from two samples of equal size.
cntrl <- c(8.4, 8.1, 5.1, 7.6, 4.7, 10.7, 5.7, 4.1, 8.1, 6.8)
strv <- c(12.4, 15.8, 11.7, 8.6, 12.6, 11.1, 10.5, 7.3, 7.2, 10.8)
I put cntrl into sample(), but where do I put strv?
single_rt <- function(size1, size2) {x1 <- sample(x = cntrl, size = size1, replace = TRUE) (mean(x1) - mean(x2))/sqrt((var(x1)/size2) + (var(x2)/size1))}
my_rt <- function(n, size1, size2) {replicate(n = n, expr = single_rt(size1 = size1, size2 = size2))}
num_reps <- 10000; N1 <- 10; N2 <- 10
t.stats <- my_rt(n = num_reps, size1 = N1, size2 = N2)
I get this error:

Error in sample(x = cntrl, size = size1, replace = TRUE)(mean(x1) - mean(x2)) : attempt to apply non-function
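Two fixes are needed, sketched below: the second sample x2 must actually be drawn (from strv), and the t-statistic must go on its own line. In the original, sample(...)(mean(x1) - mean(x2)) is parsed as calling the result of sample() as a function, which is exactly the "attempt to apply non-function" error. I have also matched each variance to its own sample size, as in the usual two-sample statistic:

```r
cntrl <- c(8.4, 8.1, 5.1, 7.6, 4.7, 10.7, 5.7, 4.1, 8.1, 6.8)
strv  <- c(12.4, 15.8, 11.7, 8.6, 12.6, 11.1, 10.5, 7.3, 7.2, 10.8)

single_rt <- function(size1, size2) {
  x1 <- sample(cntrl, size = size1, replace = TRUE)  # resample control
  x2 <- sample(strv,  size = size2, replace = TRUE)  # resample treatment
  (mean(x1) - mean(x2)) / sqrt(var(x1) / size1 + var(x2) / size2)
}

my_rt <- function(n, size1, size2) {
  replicate(n, single_rt(size1, size2))
}

num_reps <- 10000
t.stats <- my_rt(num_reps, 10, 10)
```
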

Who knows to perform FUTURES land use simulation software
I want to run the FUTure Urban-Regional Environment Simulation (FUTURES) software. Please help me prepare the data set and run the model; it would be a great support for me.
I have some knowledge about how the rules work in the software.
v.to.rast input=roads type=line where="MTFCC = 'S1630'" output=interchanges use=val
r.grow.distance -m input=interchanges distance=dist_interchanges
I want to learn about that

GAN, discriminator output only 0 or 1
I'm trying to train an SRGAN (Super-Resolution GAN). However, the discriminator's output converges to 0 or 1 whatever the input is. The discriminator's loss function is simply
D_loss = 0.5*(D_net(fake) + 1 - D_net(real))
D_net(fake) and D_net(real) both become 0 or 1 (sigmoid outputs). How can I fix it?
for epoch_idx in range(epoch_num):
    for batch_idx, data in enumerate(data_loader):
        D_net.zero_grad()
        #### make real, low, fake
        real = data[0]
        for img_idx in range(batch_size):
            low[img_idx] = trans_low_res(real[img_idx])
        fake = G_net(Variable(low).cuda())
        #### get Discriminator loss and train Discriminator
        real_D_out = D_net(Variable(real).cuda()).mean()
        fake_D_out = D_net(Variable(fake).cuda()).mean()
        D_loss = 0.5*(fake_D_out + 1 - real_D_out)
        D_loss.backward()
        D_optim.step()
        #### train Generator
        G_net.zero_grad()
        #### get new fake D out with updated Discriminator
        fake_D_out = D_net(Variable(fake).cuda()).mean()
        G_loss = generator_criterion(fake_D_out.cuda(), fake.cuda(), real.cuda())
        G_loss.backward()
        G_optim.step()
Batch : [10/6700] Discriminator_Loss: 0.0860 Generator_Loss : 0.1393
Batch : [20/6700] Discriminator_Loss: 0.0037 Generator_Loss : 0.1282
Batch : [30/6700] Discriminator_Loss: 0.0009 Generator_Loss : 0.0838
Batch : [40/6700] Discriminator_Loss: 0.0002 Generator_Loss : 0.0735
Batch : [50/6700] Discriminator_Loss: 0.0001 Generator_Loss : 0.0648
Batch : [60/6700] Discriminator_Loss: 0.5000 Generator_Loss : 0.0634
Batch : [70/6700] Discriminator_Loss: 0.5000 Generator_Loss : 0.0706
Batch : [80/6700] Discriminator_Loss: 0.5000 Generator_Loss : 0.0691
Batch : [90/6700] Discriminator_Loss: 0.5000 Generator_Loss : 0.0538 ...

Is hyperconverged infrastructure a cloud or physical solution?
I'm doing IT homework on Hyper-Converged Infrastructure and have read many pages about the subject, but none explains whether the concept is a cloud-based or a hardware (on-premises) business solution. So... what is Hyper-Converged Infrastructure, without specialized terms?

Function moisture flux convergence?
I have to compute several thermodynamic indices, and MetPy offers plenty of sounding-based parameters (e.g. CAPE, CIN, ...). For the evolution of thunderstorms, the vertically integrated moisture flux convergence is a good predictor. Does MetPy offer a function to compute moisture flux convergence?
Thanks for help and kind regards, Alexander