bnlearn::bn.fit: difference and calculation of the "mle" and "bayes" methods
I am trying to understand the differences between the two methods, bayes and mle, in the bn.fit function of the bnlearn package.
I know about the debate between the frequentist and the Bayesian approach to understanding probabilities. On a theoretical level, I suppose the maximum likelihood estimate mle is a simple frequentist approach that sets the relative frequencies as the probabilities. But what calculations are done to get the bayes estimate? I have already checked the bnlearn documentation, the description of the bn.fit function, and some application examples, but nowhere is there a real description of what happens.
I also tried to understand the function in R by first checking out bnlearn::bn.fit, leading to bnlearn:::bn.fit.backend, leading to bnlearn:::smartSapply, but then I got stuck.
Some help would be really appreciated as I use the package for academic work and therefore I should be able to explain what happens.
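For reference, my understanding (please verify against the bnlearn source and documentation) is that "mle" returns the empirical relative frequencies of each CPT entry, while "bayes" returns the expected value of the posterior over the CPT entries under a uniform Dirichlet prior whose total weight is the imaginary sample size iss, i.e. a smoothed relative frequency. A minimal sketch of the two estimates in Python, with made-up counts; exactly how bnlearn divides iss across parent configurations is an assumption here:

```python
from collections import Counter

def cpt_estimates(counts, n_states, iss=1.0):
    """Estimate P(X = x | parents = pi) for one parent configuration.

    counts  : mapping state -> observed count for this parent configuration
    n_states: number of states X can take
    iss     : imaginary sample size (prior weight); the uniform Dirichlet
              prior puts iss / n_states pseudo-counts on each state.
    """
    n = sum(counts.values())
    # "mle": plain relative frequencies
    mle = {x: counts.get(x, 0) / n for x in range(n_states)}
    # "bayes": posterior expectation = smoothed relative frequencies
    alpha = iss / n_states  # uniform prior pseudo-count per state (assumption)
    bayes = {x: (counts.get(x, 0) + alpha) / (n + iss) for x in range(n_states)}
    return mle, bayes

# Hypothetical counts for X in {0, 1, 2} under one parent configuration:
counts = Counter({0: 6, 1: 3, 2: 1})
mle, bayes = cpt_estimates(counts, n_states=3, iss=1.0)
# the bayes estimates are pulled slightly toward the uniform distribution 1/3
```

Larger iss shrinks the estimates more strongly toward uniform; as iss goes to 0 the bayes estimate converges to the mle one.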
See also questions close to this topic

Create date (month) exception for R script
I am trying to access and download some data from a website with an API connection. The challenge is that the instrument I want to trace is a future with the usual convention (each month corresponds to a letter, followed by the year). The instrument root name is JKM; the structure is JKMMYY (M for month, Y for year). Keep in mind that the month is represented by a letter. So, for instance, the contract for January 2019 would be JKMF19 (because F represents January). So far so good. I applied the following logic:
monthsymbol <- c("F", "G", "H", "J", "K", "M", "N", "Q", "U", "V", "X", "Z")
and then applied monthsymbol[month(Sys.Date()) + 1].
This works well unless I am in December, where I need to apply the first month of the next year. I am not quite sure what is the best way to implement that, maybe an if clause? Any input is welcome!
I add the code below, so you get the idea..
Many thanks in advance!
library("lubridate")
library("jsonlite")
library("dplyr")
library("xts")
library("dygraphs")
library("rvest")

##########################
# JKM historical data ####

apikey <- "a35e82bda46a34f21xxxxxxxxxxxxxxxxxxx"
current_year <- unlist(strsplit(x = as.character(year(Sys.Date())), split = "0"))[2]
todays_date <- gsub(x = as.character(Sys.Date()), pattern = "-", replacement = "")

# api in json format
json_file <- paste0("https://marketdata.websol.barchart.com/getHistory.json?apikey=", apikey,
                    paste0("&symbol=JKM", monthsymbol[month(Sys.Date()) + 1]), current_year,
                    "&type=daily&startDate=20160901&endDate=", todays_date, "&maxRecords=750")
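The December wrap-around is plain modular arithmetic: take the month index mod 12 and bump the year when it wraps. Sketched in Python for clarity; the same logic ports to R as monthsymbol[(month(Sys.Date()) %% 12) + 1], with the year incremented when month(Sys.Date()) == 12:

```python
MONTH_SYMBOL = ["F", "G", "H", "J", "K", "M", "N", "Q", "U", "V", "X", "Z"]

def next_month_symbol(month, year):
    """Return (letter, two-digit year) for the contract month after
    `month`/`year`. month is 1-12; in December (12) the index wraps
    back to January and the year is incremented."""
    idx = month % 12                      # 12 -> 0, i.e. January of next year
    yy = year + (1 if month == 12 else 0)
    return MONTH_SYMBOL[idx], yy % 100

# December 2018 rolls over to the January contract of 2019:
# next_month_symbol(12, 2018) == ("F", 19)
```

The modulo also removes the need for a special if clause for every other month.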

R/Shiny dynamically generated sliders don't generate unique values on separate tabs
I could use some help solving an issue I have with my Shiny code. I have an app with multiple tabs for different kinds of time series analyses. Each analysis uses the same dataset, which is loaded from a .csv file by the user. The user can select which variables (columns) they want to use in the analysis. On each tab, I want to do the following:
 1) render sliders based on the variables selected (this is done and is working)
 2) have Shiny create a table for a specified forecast period (i.e. 12 months) that the user defines, and populate it with the values from the sliders (this is done and is working)
 3) the sliders on each tab should work independently and generate a unique table of numbers - this is not working and is driving me nuts!
The problem is that one set of sliders generates the same values for all tables on every tab (there are 5 tabs in the app that run analyses). I suspect it is because I'm using tagList, but I'm not advanced enough yet to know what I'm doing wrong. The code below is replicated on each tab, only the reactive function names have been changed.
#This creates sliders from the selected variables from the reactive function called
#"df_sel()", which is just a function that allows the user to select which variables
#they want from the original .csv file. Note the use of "tagList". The renderUI call
#below creates as many sliders as variables selected, and passes in the correct
#variable name. It selects the last data value from each column; since this is time
#series data, the last (most recent) data value was desired.
output$scplan <- renderUI({
  vars <- df_sel()
  n <- nrow(vars)
  tagList(lapply(colnames(vars), function(z) {
    sliderInput(
      sprintf("%s", z),
      label = z,
      min = ceiling(min(vars[[z]])),  # min value is the minimum of the column
      max = ceiling(max(vars[[z]])),  # max is the max of the column
      value = vars[[z]][[n]])
  }))
})

#this reactive function creates a dataframe from the sliders above with the values
#the user selected, and repeats the values for as many "input$months" as were selected.
sp_numbers <- reactive({
  vars <- df_sel()
  num <- as.integer(ncol(vars))
  sp_pred <- data.frame(lapply(colnames(vars), function(z) {
    input[[z]]
  }))
  names(sp_pred) <- colnames(vars)
  n <- input$sp_months
  df_sp_pred <- data.frame(sp_pred, z = rep(1:n, each = NROW(input$sp_months)))
  df_sp_pred[1:(length(df_sp_pred) - 1)]  # removes the last column, which just shows the repeat count
})

#this code renders the table of the dataframe created above.
output$spo_table <- renderTable({
  sp_numbers()
})
When I put this code on a new tab, I need the sliders to generate unique values; right now, the table is populated with whatever the sliders on the first tab were set to.
PS: I know that the custom is to provide code that can be run, but for me to do that would require pasting in a significant amount of code; I thought this would be easier.
Also, if there is an easier way to accomplish what's going on in this code, by all means please edit.
Thank you!

How to fix the intercept in a binomial GLM model?
Looking for some help fitting a binomial model with a fixed intercept. I've seen this question answered successfully for linear models by subtracting the explicit intercept from the response and then fitting the intercept-free model:
intercept <- 1.0
fit <- lm(I(x - intercept) ~ 0 + y, lin)
summary(fit)
Or by specifying an offset:
lm(x ~ y + 0 + offset(rep(1, nrow(lin))), lin)
But I can't figure out how to do this with a binomial regression. Example code below:
library(ggplot2)
library(lme4)
library(dplyr)
library(tidyr)  # gather() comes from tidyr

# simulate dummy dataframe
x <- data.frame(
  time = rep(1:4, each = 9),
  type = rep(c('a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'), 4),
  randef = rep(c('aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc'), 4),
  surv = sample(x = 1:200, size = 36, replace = TRUE),
  nonsurv = sample(x = 1:200, size = 36, replace = TRUE))

# convert survival and non-survival counts into individuals, following
# https://stackoverflow.com/questions/51519690/convert-cbind-format-for-binomial-glm-in-r-to-a-dataframe-with-individual-rows
x_long <- x %>% gather(code, count, surv, nonsurv)

# function to repeat a data.frame
x_df <- function(x, n) {
  do.call('rbind', replicate(n, x, simplify = FALSE))
}

# loop through each row and then rbind together
x_full <- do.call('rbind', lapply(1:nrow(x_long), FUN = function(i)
  x_df(x_long[i, ], x_long[i, ]$count)))

# create binary code
x_full$binary <- as.numeric(x_full$code == 'surv')

### binomial glm with interaction between time and type:
summary(fm2 <- glm(binary ~ time * type, data = x_full, family = "binomial"))

### plot glm in ggplot2
ggplot(x_full, aes(x = time, y = binary, fill = type)) +
  geom_smooth(method = "glm", aes(color = factor(type)),
              method.args = list(family = "binomial"))
What I'd like to do is force the intercept through 1 (i.e. all surviving at the first time point), almost like a survival curve. Specifying an offset doesn't seem to work in this example: it is a binomial probability, so specifying a 1 or a 0 doesn't work, and the number of surviving vs non-surviving recruits at the first time point varies among factors.
Stack Overflow, I've learnt more from you than from any of my undergraduate classes or postgraduate supervisors. Any suggestions on this one?
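One way to think about the offset route: on the logit scale, fixing the intercept at some value b0 just means maximizing the likelihood over the slope alone, with b0 entering each linear predictor as a constant. (It cannot pin the intercept at probability exactly 1, since logit(1) is infinite, which is why the offset attempts above run into trouble.) A self-contained sketch with hypothetical data, fitting only the slope by Newton's method with the intercept held at b0 = 2:

```python
import math

def fit_slope_fixed_intercept(xs, ys, b0, iters=50):
    """Logistic regression P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) with b0
    held fixed; maximizes the Bernoulli log-likelihood over b1 alone."""
    b1 = 0.0
    for _ in range(iters):
        grad = 0.0  # d loglik / d b1
        hess = 0.0  # minus d2 loglik / d b1^2 (positive here)
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            grad += (y - p) * x
            hess += p * (1 - p) * x * x
        b1 += grad / hess  # Newton step; b0 is never updated
    return b1

# Hypothetical data: the outcome becomes less likely over time.
xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys = [1, 1, 1, 0, 1, 0, 0, 0]
b1 = fit_slope_fixed_intercept(xs, ys, b0=2.0)  # fitted slope is negative
```

The same idea is what offset() does inside glm: the fixed part of the linear predictor is added but never estimated.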

brms - how to deal with a missing contrast (one level to ignore)
I am trying to use the brms package to study changes in parasitism rate in four butterfly species across 3 conditions. All species but one were tested in each condition, and therein lies my problem: species 4 was not tested in condition 3.
my full model is:
mod15 <- brm(ind_parasitized | trials(ind_parasitized + ind_nonparasitized) ~
               0 + condition * butterfly_species,
             data = dat3, iter = 5000, warmup = 2000, cores = 4, family = binomial)
This model returns an error saying it failed to converge. However, from the model summary, the Rhat values, effective sample sizes, and mixing of the chains all look good, except, obviously, for the missing contrast (species 4 in condition 3), which the model fails to ignore/exclude.
Reading around, I have seen how to deal with missing values, but I could not figure out how to deal with a missing contrast. I thought I might get around it using brm_multiple, but this did not work, as in that case I would be testing contrasts that no longer exist.
Any suggestion would be greatly appreciated!

Converting WinBUGS Code for Bayesian Estimation into R
I have the WinBUGS code below, which I need to use in R. However, because I am not familiar with WinBUGS, I would appreciate any help converting it into R.
Here is the code:
# code 1
model {
  for (i in 1:N) {
    odds[i] <- exp(b0) * (1 + b1*A[i] + b2*B[i] + b3*A[i]*B[i])
    pp[i] <- odds[i] / (1 + odds[i])
    case[i] ~ dbin(pp[i], nn[i])
  }
  b0 ~ dnorm(0, precb0)
  b1 ~ dnorm(0, precb1)
  b2 ~ dnorm(0, precb2)
  b3 ~ dnorm(0, precb3)
  # ensuring appropriate constraints for b1, b2 and b3
  ones <- 1
  ones ~ dbern(C1)
  C1 <- step(1 + b1) * step(1 + b2) * step(1 + b1 + b2 + b3)
}

# code 2
model {
  for (i in 1:N) {
    pp[i] <- exp(a0) * (1 + a1*A[i] + a2*B[i] + a3*A[i]*B[i])
    case[i] ~ dbin(pp[i], nn[i])
  }
  a0 ~ dnorm(0, preca0)
  a1 ~ dnorm(0, preca1)
  a2 ~ dnorm(0, preca2)
  a3 ~ dnorm(0, preca3)
  # ensuring appropriate constraints for a0, a1, a2 and a3
  ones <- 1
  ones ~ dbern(C1)
  C1 <- step(0 - a0) * step(1 + a1) * step(exp(-a0) - 1 - a1) * step(1 + a2) *
        step(exp(-a0) - 1 - a2) * step(1 + a1 + a2 + a3) * step(exp(-a0) - 1 - a1 - a2 - a3)
}
I want to convert and implement this code in R using the data below, but I am not certain how to accomplish that:
(dat <- data.frame(
  A = c(1, 1, 0, 0),
  B = c(1, 0, 1, 0),
  case = c(278, 100, 153, 79),
  negative = c(743, 581, 1232, 1731)))
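Whatever sampler or optimizer ends up being used in R, the substance of code 1 is just a likelihood: odds_i = exp(b0) * (1 + b1*A[i] + b2*B[i] + b3*A[i]*B[i]), pp_i = odds_i / (1 + odds_i), and case_i ~ Binomial(nn_i, pp_i). A hedged sketch of that log-likelihood in Python using the data above (priors and constraints omitted); in R the same function could be handed to optim, or the model run in JAGS, which accepts essentially the same BUGS-style syntax:

```python
import math

A    = [1, 1, 0, 0]
B    = [1, 0, 1, 0]
case = [278, 100, 153, 79]
neg  = [743, 581, 1232, 1731]

def log_lik(b0, b1, b2, b3):
    """Binomial log-likelihood of model 1; the binomial coefficient is
    included via lgamma so values are comparable across models."""
    ll = 0.0
    for a, b, k, m in zip(A, B, case, neg):
        n = k + m                                   # nn[i]: total at risk
        odds = math.exp(b0) * (1 + b1 * a + b2 * b + b3 * a * b)
        p = odds / (1 + odds)                       # pp[i]
        ll += (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return ll

ll = log_lik(-2.0, 0.5, 0.3, 0.1)  # finite for any parameters with odds > 0
```

The step()-based constraint trick in the WinBUGS code simply zeroes the posterior wherever the parameters would make the probabilities invalid; in a hand-rolled likelihood one would return -inf in those regions instead.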
Thank you in advance for any help.

How to calculate 95% confidence intervals using bayesboot()
I need help calculating the bootstrap-based credible interval of the quantity qtt.ci from my regression coefficients coef.def. So far my attempts have resulted in:
Error in quantile.default(s, c(0.025, 0.25, 0.5, 0.75, 0.975)) : missing values and NaN's not allowed if 'na.rm' is FALSE
preceded by:
Warning message: In bayesboot(dat, boot_fn) : The sample from bayesboot contains either NAs, NaNs or NULLs. Make sure that your statistic function only return actual values.
Here are my sample data:
dat <- data.frame(
  A = c(1, 1, 0, 0),
  B = c(1, 0, 1, 0),
  Pass = c(278, 100, 153, 79),
  Fail = c(743, 581, 1232, 1731))
Below is my regression. The quantity I want the bootstrap-based 95% credible interval for is qtt.ci:
boot_fn <- function(dat) {
  coef.def <- unname(coef(glm(cbind(Pass, Fail) ~ A * B, binomial, dat)))
}
qtt.ci <- exp(sum(coef.def[2:4])) - exp(coef.def[2]) - exp(coef.def[3]) + 1
Here is my attempt:
bb_ci <- bayesboot(dat, boot_fn)
summary(bb_ci)
I am not certain how to get the bootstrap-based confidence interval for qtt.ci. Thank you in advance.
EDIT:
Following the answer by @RuiBarradas, I tried bootstrapping to get the 95% CI for the quantity qtt.ci, but with no success:
library(bayesboot)
boot_fn <- function(dat) {
  coef.def <- unname(coef(glm(cbind(Pass, Fail) ~ A * B, binomial, dat)))
  qtt <- (exp(sum(coef.def[2:4])) - exp(coef.def[2]) - exp(coef.def[3]) + 1)
  if (all(!is.na(qtt))) qtt else NULL
}
Runs <- 1e2
qtt.ci <- bayesboot(dat, boot_fn, R = Runs, R2 = Runs)
summary(qtt.ci)

Quantiles:
 statistic     q2.5%      q25%    median      q75%    q97.5%
        V1  2.705878  2.705878  2.705878  2.705878  2.705878
Therefore, this does not give the CI for qtt.ci. The output is simply the point estimate for qtt:
qtt <- (exp(sum(coef.def[2:4])) - exp(coef.def[2]) - exp(coef.def[3]) + 1)
qtt
[1] 2.705878
Any help would be much appreciated.
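For intuition about what bayesboot expects: the Bayesian bootstrap draws Dirichlet(1, ..., 1) weights over the observations, re-evaluates the statistic under each weighting, and reads the credible interval off the quantiles of those draws; the statistic function therefore has to return the final scalar (here qtt), and it has to respond to the reweighted data rather than always seeing the same rows. A generic sketch in Python, with a weighted mean standing in for the GLM-based statistic (the R-specific refit is out of scope here):

```python
import random

def bayes_boot(data, statistic, n_draws=2000, seed=1):
    """Bayesian bootstrap: evaluate statistic(data, weights) under
    Dirichlet(1, ..., 1) weights; return the list of draws."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        g = [rng.gammavariate(1.0, 1.0) for _ in data]
        s = sum(g)
        w = [gi / s for gi in g]          # one Dirichlet(1,..,1) sample
        draws.append(statistic(data, w))
    return draws

def weighted_mean(data, w):
    return sum(x * wi for x, wi in zip(data, w))

data = [2.9, 3.1, 3.4, 2.7, 3.0, 3.3, 2.8, 3.2]
draws = sorted(bayes_boot(data, weighted_mean))
lo, hi = draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))]
# (lo, hi) is a 95% credible interval for the mean of the data
```

If every draw of the statistic comes out identical (as in the output above), the statistic is not actually using the resampled weights/rows, which is exactly the symptom in the question.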

Test for conditional independence in Python as part of the PC algorithm
I'm implementing the PC algorithm in Python. The algorithm constructs the graphical model of an n-variate Gaussian distribution. This graphical model is basically the skeleton of a directed acyclic graph, which means that if a structure like
(x1) - (x2) - (x3)
is in the graph, then x1 is independent of x3 given x2. More generally, if A is the adjacency matrix of the graph and A(i,j) = A(j,i) = 0 (there is a missing edge between i and j), then i and j are conditionally independent given all the variables that appear in any path from i to j. For statistical and machine learning purposes, it is possible to "learn" the underlying graphical model. If we have enough observations of a jointly Gaussian n-variate random variable, we can use the PC algorithm, which works as follows:
given n as the number of variables observed, initialize the graph as G = K(n)
for each pair i, j of nodes:
    if there exists an edge e from i to j:
        look for the neighbours of i
        if j is in the neighbours of i, remove j from the set of neighbours
        call the resulting set of neighbours k
        TEST if i and j are independent given the set k; if TRUE:
            remove the edge e from i to j
The algorithm also computes the separating sets of the graph, which are used by another algorithm that constructs the DAG starting from the skeleton and the separation sets returned by the PC algorithm. This is what I've done so far:
def _core_pc_algorithm(a, sigma_inverse):
    l = 0
    N = len(sigma_inverse[0])
    n = range(N)
    sep_set = [[set() for i in n] for j in n]
    act_g = complete(N)
    z = lambda m, i, j: -m[i][j] / ((m[i][i] * m[j][j]) ** 0.5)
    while l < N:
        for (i, j) in itertools.permutations(n, 2):
            adjacents_of_i = adj(i, act_g)
            if j not in adjacents_of_i:
                continue
            else:
                adjacents_of_i.remove(j)
            if len(adjacents_of_i) >= l:
                for k in itertools.combinations(adjacents_of_i, l):
                    if N - len(k) - 3 < 0:
                        return (act_g, sep_set)
                    if test(sigma_inverse, z, i, j, l, a, k):
                        act_g[i][j] = 0
                        act_g[j][i] = 0
                        sep_set[i][j] = set(k)
                        sep_set[j][i] = set(k)
        l = l + 1
    return (act_g, sep_set)
Here a is the tuning parameter alpha with which I will test for conditional independence, and sigma_inverse is the inverse of the covariance matrix of the sampled observations. Moreover, my test is:
def test(sigma_inverse, z, i, j, l, a, k):
    def erfinv(x):  # used to approximate the inverse of a Gaussian cumulative density function
        sgn = 1
        a = 0.147
        PI = numpy.pi
        if x < 0:
            sgn = -1
        temp = 2 / (PI * a) + numpy.log(1 - x ** 2) / 2
        add_1 = temp ** 2
        add_2 = numpy.log(1 - x ** 2) / a
        add_3 = temp
        rt1 = (add_1 - add_2) ** 0.5
        rtarg = rt1 - add_3
        return sgn * (rtarg ** 0.5)

    def indep_test_ijK(K):  # compute partial correlation of i and j given ONE conditioning variable K
        part_corr_coeff_ij = z(sigma_inverse, i, j)  # partial correlation coefficient of i and j
        part_corr_coeff_iK = z(sigma_inverse, i, K)  # partial correlation coefficient of i and K
        part_corr_coeff_jK = z(sigma_inverse, j, K)  # partial correlation coefficient of j and K
        # partial correlation coefficient of i and j given K
        part_corr_coeff_ijK = (part_corr_coeff_ij - part_corr_coeff_iK * part_corr_coeff_jK) / \
            (((1 - part_corr_coeff_iK ** 2) ** 0.5) * ((1 - part_corr_coeff_jK ** 2) ** 0.5))
        # i independent of j given K if partial_correlation(i,j)|K == 0
        # (under jointly Gaussian assumption) [could check if abs is < alpha?]
        return part_corr_coeff_ijK == 0

    def indep_test():
        n = len(sigma_inverse[0])
        phi = lambda p: (2 ** 0.5) * erfinv(2 * p - 1)
        root = (n - len(k) - 3) ** 0.5
        return root * abs(z(sigma_inverse, i, j)) <= phi(1 - a / 2)

    if l == 0:
        # i independent of j <=> partial_correlation(i,j) == 0
        # (under jointly Gaussian assumption) [could check if abs is < alpha?]
        return z(sigma_inverse, i, j) == 0
    elif l == 1:
        return indep_test_ijK(k[0])
    elif l == 2:
        # ASSUMING THAT i,j INDEPENDENT GIVEN y,z <=> i,j INDEPENDENT GIVEN y AND i,j INDEPENDENT GIVEN z
        return indep_test_ijK(k[0]) and indep_test_ijK(k[1])
    else:
        # i have to use the independence test with the z-Fisher function
        return indep_test()
where z is a lambda that receives a matrix (the inverse of the covariance matrix), an integer i, and an integer j, and computes the partial correlation of i and j given all the remaining variables with the following rule (which I read in my teacher's slides):
corr(i,j) | REST = -var^-1(i,j) / sqrt(var^-1(i,i) * var^-1(j,j))
The main core of this application is the indep_test() function:
def indep_test():
    n = len(sigma_inverse[0])
    phi = lambda p: (2 ** 0.5) * erfinv(2 * p - 1)
    root = (n - len(k) - 3) ** 0.5
    return root * abs(z(sigma_inverse, i, j)) <= phi(1 - a / 2)
This function implements a statistical test which uses Fisher's z-transform of estimated partial correlations. I am using this algorithm in two ways:
 Generate data from a linear regression model and compare the learned DAG with the expected one
 Read a dataset and learn the underlying DAG
In both cases I do not always get correct results, either when I know the DAG underlying a certain dataset, or when I know the generative model and it does not coincide with the one my algorithm learns. I know perfectly well that this is a non-trivial task, and I may have misunderstood theoretical concepts as well as made errors even in parts of the code I have omitted here; but first I'd like to know (from someone more experienced than me) whether the test I wrote is right, and also whether there are library functions that perform this kind of test. I tried searching but couldn't find any suitable function.
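For comparison, the textbook Fisher-z conditional-independence test: with r the estimated partial correlation of i and j given the conditioning set k, compute z(r) = 0.5 * ln((1 + r) / (1 - r)) and reject independence when sqrt(n_samples - |k| - 3) * |z(r)| exceeds Phi^-1(1 - alpha/2). Note that n here is the number of samples, not the number of variables, which is one thing worth double-checking in the code above, where n is taken from the dimension of sigma_inverse. A sketch:

```python
import math
from statistics import NormalDist

def fisher_z(r):
    """Fisher z-transform of a (partial) correlation coefficient."""
    return 0.5 * math.log((1 + r) / (1 - r))

def ci_test(r, n_samples, cond_set_size, alpha=0.05):
    """True if the partial correlation r is statistically indistinguishable
    from 0, i.e. i and j are judged independent given the conditioning set."""
    stat = math.sqrt(n_samples - cond_set_size - 3) * abs(fisher_z(r))
    return stat <= NormalDist().inv_cdf(1 - alpha / 2)

# The same small partial correlation is "dependent" with many samples...
# ci_test(r=0.05, n_samples=10000, cond_set_size=1) -> False
# ...but not distinguishable from 0 with few samples:
# ci_test(r=0.05, n_samples=50, cond_set_size=1)    -> True
```

Comparing against zero exactly (as the l == 0 and l <= 2 branches above do) will essentially never fire on estimated correlations, so routing every l through this thresholded test is the standard approach.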

Normal and naive Bayes - duplicating data improves classification [Weka]
I wanted to examine how a Bayes network behaves if I only manipulate the size of the data set.
Since I am still thinking about a data generator (I can't find a proper article), I decided to just multiply the data I have. I thought that if I only duplicated it, the classification results would not change, just the model creation time.
Well I did the tests on 2 datasets:
data1: 768 instances, 8 attributes, yes/no classification
data2: 70 instances, 206 attributes, 16-type classification
The computing times on the original sets are between 0 and 0.01 sec, so I can't get much from them. The interesting numbers come after I multiplied the data.
The testing method was cross-validation with 10 folds. Everything was done in Weka.
Format:
Number_of_instances : model_creation_time : %Correctly_Classified_Instances
Data1, Naive Bayes:
768 : 0s : 76.3021%
7680 : 0.01s : 76.2891%
76800 : 0.11s : 76.2708%
768000 : 1.39s : 76.3021%
Data1, Normal Bayes:
768 : 0.02s : 74.349%
7680 : 0.08s : 85.7943%
76800 : 1.43s : 97.8659%
768000: 26.18s : 97.824%
Data2, Naive Bayes:
70 : 0s : 82.8571%
700 : 0s : 92%
7000: 0.02s : 94.2857%
70000 : 0.21sec : 94.2857%
Data2, Normal Bayes:
70 : 0.01s : 82.8571%
700 : 0.01s : 93.2857%
7000 : 0.11s : 94.2857%
70000 : 2.63s : 94.2857%
The computing time increased in all cases (Normal Bayes took more time, but on data2, with more attributes, it grew much more slowly).
The proportion of properly classified instances rose sharply only for Normal Bayes on data1 (8 attributes),
while on data2 (206 attributes) classification improved greatly with both methods.
How did the duplication of data help classification? Did I do something wrong, or are these somehow spurious results? Can this be explained?
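One plausible explanation is leakage rather than genuine learning: after duplication, almost every instance in a test fold has exact copies sitting in the training folds, so 10-fold cross-validation effectively evaluates the model on its own training data, which flatters the more flexible estimator. The mechanism is easy to demonstrate (contiguous folds for simplicity; Weka's shuffled, stratified folds give the same effect):

```python
def kfold_leakage(data, k=10):
    """Fraction of test items (across k contiguous folds) that also occur,
    as exact duplicates, somewhere in the corresponding training folds."""
    n = len(data)
    leaked = total = 0
    for f in range(k):
        lo, hi = f * n // k, (f + 1) * n // k
        test = data[lo:hi]
        train = set(data[:lo] + data[hi:])
        leaked += sum(1 for x in test if x in train)
        total += len(test)
    return leaked / total

original = list(range(70))       # 70 distinct instances: nothing leaks
duplicated = original * 10       # every instance now exists 10 times

# kfold_leakage(original)   -> 0.0 (test folds share nothing with training)
# kfold_leakage(duplicated) -> 1.0 (every test item has copies in training)
```

With full leakage, cross-validation accuracy approaches training accuracy, which is exactly the pattern of the Normal Bayes numbers above.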

Weka: why does getMargin return all zeros?
I am using the Weka Java API. I trained a BayesNet on an Instances object (data set) data:
/**
 * Initialization
 */
Instances data = ...;
BayesNet bn = new EditableBayesNet(data);
SearchAlgorithm learner = new TAN();
SimpleEstimator estimator = new SimpleEstimator();

/**
 * Training
 */
bn.initStructure();
learner.buildStructure(bn, data);
estimator.estimateCPTs(bn);
getMargin returns the marginal distribution for a node. Ideally, assuming node A has 3 possible values and its node index is 0, then bn.getMargin(0) should return something like [0.3, 0.4, 0.3]. However, in my case, when I print the marginal distributions of all nodes using the following code:
for (int i = 0; i < bn.getNrOfNodes(); i++)
    System.out.println(Arrays.toString(bn.getMargin(i)));
It returns
[0.0, 0.0, 0.0, ...]
[0.0, 0.0, 0.0, ...]
[0.0, 0.0, 0.0, ...]
...
Has anyone seen this before, or can you possibly give me some hints as to why this happens?
My data are nominal and most of the columns have many classes/values.

How to incorporate weights into a likelihood function in the context of maximum likelihood modelling
I want to incorporate weights into the likelihood to do what svyglm does with weights (i.e., create pseudo-populations). According to @Jeremy Miles and others, the svyglm function uses weights to "weight the importance of each case to make them representative (to each other, ...)".
Here are my data:
(dat <- data.frame(
  A = c(1, 1, 0, 0),
  B = c(1, 0, 1, 0),
  Pass = c(278, 100, 153, 79),
  Fail = c(743, 581, 1232, 1731),
  Weights = c(3, 1, 12, 3)))
Here is my likelihood function:
ll <- function(b0, b1, b2, b3) {
  odds <- exp(b0) * (1 + b1 * dat$A + b2 * dat$B + b3 * dat$A * dat$B)
  sum(dbinom(
    x = dat$Pass,
    size = rowSums(dat[, 3:4]),
    prob = odds / (1 + odds),
    log = TRUE))
}
Thank you in advance.
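The usual pseudo-likelihood approach is to multiply each case's log-likelihood contribution by its weight, i.e. replace sum(log f(y_i)) with sum(w_i * log f(y_i)); in the R function above that would mean sum(dat$Weights * dbinom(..., log = TRUE)). A sketch of the weighted version in Python with the same data (note that, as I understand svyglm, this reproduces the weighted point estimates but not the design-based standard errors):

```python
import math

A = [1, 1, 0, 0]
B = [1, 0, 1, 0]
Pass = [278, 100, 153, 79]
Fail = [743, 581, 1232, 1731]
Weights = [3, 1, 12, 3]

def binom_logpmf(k, n, p):
    """log P(K = k) for K ~ Binomial(n, p), via lgamma."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def weighted_ll(b0, b1, b2, b3):
    """Each case's binomial log-likelihood contribution is scaled by its
    survey weight, mimicking the pseudo-population idea."""
    ll = 0.0
    for a, b, k, m, w in zip(A, B, Pass, Fail, Weights):
        odds = math.exp(b0) * (1 + b1 * a + b2 * b + b3 * a * b)
        p = odds / (1 + odds)
        ll += w * binom_logpmf(k, k + m, p)
    return ll

ll = weighted_ll(-2.0, 0.5, 0.3, 0.1)
```

Maximizing weighted_ll gives the same point estimates a weight-aware fitter would; valid standard errors under a sampling design need a sandwich-type correction on top.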

MLE with vectors as arguments
I'm trying to use the mle() function in R, but it seems the start list argument can only be a list of single elements, not vectors. Is there a way I could pass a list of vectors?
Below is what I have currently:
inits <- list(position = c(0.5, 0.5, 0.2, 0.2, 0.3, 0.3, 0.4, 0),
              weight = c(5, 1),
              beta = c(1, 1),
              prob_s = c(0.5, 0.5))
fit <- mle(LL, start = inits)
My LL function looks as below:
LL <- function(position, weight, beta, prob_s)
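A common workaround when an optimizer insists on scalar or flat parameters is to flatten the vectors into one vector and unpack it inside the objective; R's optim takes a single numeric vector, so the same trick makes it a drop-in alternative to mle here. The bookkeeping, sketched in Python with the hypothetical names from the question:

```python
def pack(params):
    """Flatten an ordered {name: list} spec into one flat list plus the
    recipe needed to invert the flattening."""
    flat, spec = [], []
    for name, values in params.items():
        spec.append((name, len(values)))
        flat.extend(values)
    return flat, spec

def unpack(flat, spec):
    """Rebuild the {name: list} structure from a flat parameter vector."""
    out, i = {}, 0
    for name, size in spec:
        out[name] = flat[i:i + size]
        i += size
    return out

inits = {"position": [0.5, 0.5, 0.2, 0.2, 0.3, 0.3, 0.4, 0],
         "weight": [5, 1], "beta": [1, 1], "prob_s": [0.5, 0.5]}
flat, spec = pack(inits)

def objective(flat_params):        # what the optimizer would call
    p = unpack(flat_params, spec)  # p["position"], p["weight"], ...
    return 0.0                     # placeholder for the real log-likelihood
```

The round trip unpack(pack(inits)[0], spec) reproduces the original structure, so the vector-argument LL can stay unchanged inside the wrapper.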

Obtaining Standard Errors in Optim() in R
I'm using maximum likelihood estimation with the optim() function in R, in a similar way as follows:
optim(c(phi, phi2, lambda), objf, method = "L-BFGS-B",
      lower = c(-1.5, -1.5, 0), upper = c(1.5, 1.5, 1),
      model = model_gaussian)
(where objf is the function whose parameters are being estimated). Now, I would like to obtain the standard errors (in order to calculate t-stats) for the parameters I am estimating. However, since this optimisation is a constrained maximum likelihood estimation (due to "L-BFGS-B"), I apparently cannot use the Hessian matrix to obtain standard errors, since it would not be mathematically correct; this means I can't just add hessian = TRUE to the optim call.
line.For example, if I were using the
BFGS
(which is not restricted/constrained) rather than theLBFGSB
method, to find standard errors is pretty simple:opt < optim(c(phi,phi2, lambda), objf, method = "LBFGSB", lower = c(1.5, 1.5, 0), upper = c(1.5, 1.5, 1), model = model_gaussian, hessian = TRUE) standarderrors < sqrt(abs(diag(solve(opt$hessian))))
For this reason I thought I'd ask: what's the best way to obtain standard errors for my problem, and what is the appropriate method for going about it? Thanks in advance.
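One pragmatic route: if the optimum turns out to lie strictly inside the box (no bound active), the usual asymptotic standard errors sqrt(diag(H^-1)) remain valid, and H can be computed by finite differences at the optimum. A sketch on a one-parameter case with a known answer (normal mean with sigma = 1, where the standard error must come out as 1/sqrt(n)):

```python
import math

def neg_ll(mu, data, sigma=1.0):
    """Negative log-likelihood of N(mu, sigma^2) for the sample."""
    return sum(0.5 * ((x - mu) / sigma) ** 2
               + math.log(sigma * math.sqrt(2 * math.pi)) for x in data)

def second_derivative(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2

data = [0.3, -0.1, 0.7, 0.2, -0.4, 0.5, 0.1, -0.2, 0.6, 0.0]
mu_hat = sum(data) / len(data)              # the MLE of the mean
hess = second_derivative(lambda m: neg_ll(m, data), mu_hat)
se = math.sqrt(1.0 / hess)                  # 1-parameter case of sqrt(diag(H^-1))
# se matches the analytic value sigma / sqrt(n) = 1 / sqrt(10)
```

If a bound is active at the optimum, these errors are not meaningful; reparameterizing (e.g. a logit transform for the [0, 1]-bounded parameter) so the optimum is interior is a common alternative.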

Accuracy Measure for Bayesian Network
I was experimenting with the INSURANCE data in bnlearn. To measure model accuracy, I tried predicting different nodes. With each node, the prediction shows a different accuracy rate. So my doubt is: which node should I choose to measure the entire model's accuracy?

How to get the joint probability using BNlearn R package
I am using the bnlearn package to learn a BN from a dataset like this, but I couldn't find a function that returns the joint probability of a set of variables given the BN.
I can only find conditional queries, but not joint probability queries.
Any ideas, please.
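For a full assignment of all variables, the joint probability is available directly from the fitted CPTs via the chain rule, P(x1, ..., xn) = prod_i P(x_i | parents(x_i)); a joint over a subset of variables is then a sum of such terms over the remaining ones. A toy sketch with a hypothetical two-node network A -> B and made-up numbers:

```python
# CPTs of a hypothetical network A -> B
p_A = {0: 0.6, 1: 0.4}
p_B_given_A = {0: {0: 0.9, 1: 0.1},   # P(B | A = 0)
               1: {0: 0.2, 1: 0.8}}   # P(B | A = 1)

def joint(a, b):
    """Chain rule: P(A = a, B = b) = P(A = a) * P(B = b | A = a)."""
    return p_A[a] * p_B_given_A[a][b]

total = sum(joint(a, b) for a in (0, 1) for b in (0, 1))
# total == 1.0: the factorization defines a proper distribution
```

In bnlearn the CPTs can be read off the fitted bn.fit object, so the same product can be computed by hand even without a dedicated joint-query function.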

Bayesian Network in Python
Currently, I use the bnlearn library in R to implement a Bayesian network model. The model requires a node structure designed by domain experts. I feed the model the node structure and a training dataset.
Now I need to replicate the same thing in Python. I searched through different libraries and did not find something similar. The most similar one is perhaps naive Bayes in scikit-learn; however, naive Bayes does not require any node structure!
I would appreciate it if anyone could help me replicate my R model in Python.