Is there a solution to the floating point issue?
I understand that certain numbers can't be represented as a floating point value, but is there another way to correct this if it affects my analysis?
> a <- 6.959435821
> sprintf("%5.25f", a)
[1] "6.9594358209999995779071469"
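Binary doubles cannot store such values exactly, but for most analyses the standard fix is to compare with a tolerance instead of exact equality. A minimal base-R sketch (values chosen purely for illustration):

```r
# Exact comparison fails: 0.1, 0.2 and 0.3 have no exact binary representation
0.1 + 0.2 == 0.3                      # FALSE

# all.equal() compares within a small tolerance
# (default is sqrt(.Machine$double.eps), about 1.5e-8)
isTRUE(all.equal(0.1 + 0.2, 0.3))     # TRUE

# or compare against an explicit tolerance of your own choosing
abs((0.1 + 0.2) - 0.3) < 1e-9         # TRUE
```

For printing, `round()` or a narrower `sprintf()` format hides the representation error; for comparisons, the tolerance approach above is usually what is wanted.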
See also questions close to this topic

How to fix the intercept in a binomial GLM model?
Looking for some help fitting a binomial model with a fixed intercept. I've seen this question answered for linear models by subtracting the explicit intercept from the response and then fitting the intercept-free model:
intercept <- 1.0
fit <- lm(I(x - intercept) ~ 0 + y, lin)
summary(fit)
Or by specifying an offset:
lm(x ~ y + 0 + offset(rep(1, nrow(lin))), lin)
But I can't figure out how to do this with a binomial regression. Example code below:
library(ggplot2)
library(lme4)
library(dplyr)
library(tidyr)  # gather() lives in tidyr

# simulate dummy dataframe
x <- data.frame(
  time = rep(1:4, each = 9),
  type = rep(rep(c('a', 'b', 'c'), each = 3), 4),
  randef = rep(c('aa', 'ab', 'ac', 'ba', 'bb', 'bc', 'ca', 'cb', 'cc'), 4),
  surv = sample(x = 1:200, size = 36, replace = TRUE),
  nonsurv = sample(x = 1:200, size = 36, replace = TRUE))

# convert surv/nonsurv counts into individual rows, following
# https://stackoverflow.com/questions/51519690/convert-cbind-format-for-binomial-glm-in-r-to-a-dataframe-with-individual-rows
x_long <- x %>% gather(code, count, surv, nonsurv)

# function to repeat a data.frame n times
x_df <- function(x, n) {
  do.call('rbind', replicate(n, x, simplify = FALSE))
}

# loop through each row and then rbind together
x_full <- do.call('rbind', lapply(1:nrow(x_long), FUN = function(i)
  x_df(x_long[i, ], x_long[i, ]$count)))

# create binary code
x_full$binary <- as.numeric(x_full$code == 'surv')

### binomial glm with interaction between time and type:
summary(fm2 <- glm(binary ~ time * type, data = x_full, family = "binomial"))

### plot glm in ggplot2
ggplot(x_full, aes(x = time, y = binary, fill = type)) +
  geom_smooth(method = "glm", aes(color = factor(type)),
              method.args = list(family = "binomial"))
What I'd like to do is force the intercept through 1 (i.e. all individuals surviving at the first time point), almost like a survival curve. Specifying an offset doesn't seem to work in this example: since it is a binomial probability, specifying a 1 or a 0 doesn't work, and the number of surviving vs. non-surviving recruits in the first time point varies among factors.
Stack Overflow, I've learnt more from you than from any of my undergraduate classes or postgraduate supervisors. Any suggestions on this one?
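One hedged sketch of the general technique (simulated data; `b0` is a hypothetical fixed value, not from the question): an offset in a binomial GLM fixes the intercept on the link (log-odds) scale, not the probability scale, so a probability of exactly 1 is unattainable (`qlogis(1)` is `Inf`), but any fixed probability below 1 can be imposed:

```r
# Sketch: fix the intercept of a logistic regression via an offset.
# The data and the fixed intercept b0 are made up for illustration.
set.seed(1)
d <- data.frame(x = rnorm(200))
d$y <- rbinom(200, 1, plogis(0.5 + 1.2 * d$x))

b0 <- qlogis(0.95)  # hold the intercept at probability 0.95, on the logit scale
fit <- glm(y ~ 0 + x + offset(rep(b0, nrow(d))),
           data = d, family = binomial)
coef(fit)  # slope only; the intercept is held at b0
```

Whether fixing the intercept just below 1 is an acceptable approximation for a survival-style curve is a modelling judgment, not something this sketch settles.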

Extract coefficient functions from a fitted pfr object from the refund R package
I am trying to extract coefficients from a fitted pfr object. I fit a Functional Generalized Linear Model (FGLM) of the form

E[Y_i] = X_i' \beta + \int_T f[t, Z_i(t)] dt

I then try to extract the coefficients via coef(fit_FGAM, n = c(n_1, n_2)). From the documentation, I read:

n - integer vector indicating the number of equally spaced coordinates for each argument. If length 1, the same number is used for each argument. Otherwise, the length must match object$smooth[[select]]$dim.

It is left unclear to me:

1. Which element of, say, a 2-element vector n = c(n_1, n_2) corresponds to the argument t, and which to the argument Z_i?
2. I presume (guess) that by default the range of coordinates for the argument t (time) is (min(t), max(t)), given that the t coordinates are the same for all observations. Is this the case?
3. How is the range of coordinates for Z_i defined? Is it the range of all (scalar) values of Z_i present in the dataset, across all time points and all observations?

R foreach loop runs out of memory in HPC environment
I am using the foreach package in R to process raster files.
The R code below works fine locally (on Windows) when adapted to an 8-core processor, but runs out of memory in an HPC environment with 48 cores. The HPC environment has much more memory available (2 TB across all 48 cores) than my local box (32 GB), so total memory is not the limiting factor.
Memory use creeps up as the foreach loop proceeds. The creep is slow, but enough to eventually exhaust memory.
I have tried switching parallel packages (to doMC, doSNOW), adding numerous garbage-collection calls and rm() of large objects at the end of every iteration, fiddling with the number of cores used, as well as removing any temporary files immediately.
Any ideas on what may be causing my memory issues?
# Set Java memory maximum
options(java.parameters = "-Xmx39g")

library(sp)
library(raster)
library(dismo)
library(foreach)
library(doParallel)
library(rgdal)
library(rJava)

# Set directories
relPath <- "E:/BIEN_Cactaceae/"
bufferDir <- "Data/Buffers"
climDir <- "Data/FutureClimate/"
outDir <- "Analyses/FutureRanges/"
modelDir <- "Analyses/MaxEnt/"
outfileDir <- "OutFiles/"
tempDir <- "E:/Tmp/"

# Set directory for raster temporary files
rasterOptions(tmpdir = tempDir)

# Search for models
models <- list.files(path = paste0(relPath, modelDir), pattern = "rda$")

# Set up cluster
cl <- makeCluster(48, type = "FORK",
                  outfile = paste0(relPath, outfileDir, "predictFuture.txt"))
registerDoParallel(cl)

# Loop through species and predict current ranges
foreach(i = 1:length(models),
        .packages = c("sp", "raster", "dismo", "rgdal", "rJava"),
        .inorder = F) %dopar% {

  # Get taxon
  taxon <- strsplit(models[i], ".", fixed = T)[[1]][1]

  # Get buffer
  tmpBuffer <- readOGR(dsn = paste0(relPath, bufferDir),
                       layer = paste0(taxon, "_buff"), verbose = F)

  # Get scenarios
  scenarios <- list.files(path = paste0(relPath, climDir), pattern = "tif$")

  # Get model
  load(paste0(relPath, modelDir, models[i]))

  # Loop over scenarios
  for (j in scenarios) {

    # Get scenario name
    tmpScenarioName <- strsplit(j, ".", fixed = T)[[1]][1]

    # Skip scenario if already processed
    if (!file.exists(paste0(relPath, outDir, taxon, "_",
                            tmpScenarioName, ".tif"))) {

      # Read, crop, mask predictors
      print(paste0(taxon, " - ", tmpScenarioName, ": processing"))
      tmpScenarioStack <- raster::stack(paste0(relPath, climDir, j))
      preds <- raster::crop(tmpScenarioStack, tmpBuffer)
      preds <- raster::mask(preds, tmpBuffer)

      # Rename predictors
      tmpNames <- paste0(taxon, ".", 1:20)
      tmpNames <- gsub("-", ".", tmpNames, fixed = T)
      tmpNames <- gsub(" ", "_", tmpNames, fixed = T)
      names(preds) <- tmpNames

      # Predict with model
      prediction <- dismo::predict(model_all, preds, progress = "")

      # Export predictions
      writeRaster(prediction, paste0(relPath, outDir, taxon, "_",
                                     tmpScenarioName, ".tif"))
      removeTmpFiles(h = 2)
    }
  }
}

stopCluster(cl)
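Not a definitive diagnosis, but one pattern worth checking (with hypothetical stand-in objects): the value a %dopar% body returns is shipped back to and retained by the master, so returning anything large can drive exactly this kind of slow creep. A minimal sketch of freeing each worker's large objects and returning only a small summary:

```r
library(foreach)
library(doParallel)

cl <- makeCluster(2)   # small PSOCK cluster for illustration
registerDoParallel(cl)

res <- foreach(i = 1:4, .combine = c, .inorder = FALSE) %dopar% {
  big <- numeric(1e6)      # stand-in for a large raster held in memory
  out <- sum(big) + i      # keep only a small summary value
  rm(big); gc()            # free the worker's copy before the next task
  out
}
stopCluster(cl)
sort(res)  # 1 2 3 4
```

In the raster loop above, the body's last expression is the result of the inner for loop; explicitly returning `NULL` at the end of the %dopar% block (the rasters are already written to disk) would make sure nothing large is accumulated on the master.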