List the column names of multiple data frames as one data frame
How can I capture the column names of multiple data frames (each with a different number of columns) in one data frame, so that I can quickly see which columns each data frame has and which ones are common to all?
I've got a small example below, but it does not get me exactly what I'm after: the shorter columns are recycled to match the length of the longer ones. (If I use a list instead, how can I view each data frame's column names side by side?)
dfA <- data.frame(colA = runif(10), colB = LETTERS[1:10])
dfB <- data.frame(col1 = rnorm(5), col2 = LETTERS[11:15], col3 = LETTERS[20:24], colB = LETTERS[14:18])
listDF <- list(dfA = names(dfA), dfB = names(dfB))
View(as.data.frame(listDF))
See also questions close to this topic

How can I predict cumulative incidence following a competing risks regression using crprep and coxph?
I've estimated a competing risks hazard regression (specifically the Fine and Gray (1999) model) using the combination of crprep to create censoring weights, which are then given to coxph in the weights argument. The problem is that the raw coefficients of the Fine and Gray model don't have any easy interpretation (see here). The solution to that problem is to predict the cumulative incidence at different points based on the regression (an example here, but using crr instead of crprep/coxph). But I can't readily figure out how to get predictions out of the coxph object produced by the crprep/coxph combo. The mstate package manual is no help on this point either.
I've tried plugging my coxph result into both predict() and survfit, but didn't get results that made much sense, although possibly I didn't set them up correctly.
I should note that I set my data up as multiple objects per spell because I have time-varying covariates, in case that matters.

Split and reconcatenate a string in R
I am trying to get the host of an IP address from a list of strings.
ips <- c('140.112.204.42', '132.212.14.139', '31.2.47.93', '7.112.221.238')
I want to get the first two parts (octets) of each IP. Desired output:
ips <- c('140.112', '132.212', '31.2', '7.112')
This is the code that I wrote to convert them:
cat(unlist(strsplit(ips, "\\.", fixed = FALSE))[1:2], sep = ".")
When I check the type of individual ips in the end I get something like this:
140.112 NULL
I'm not sure what I am doing wrong. If you have a completely different approach, that is fine too. Thank you in advance for any help.

Using holidays as external regressors in auto.arima, but getting very high MAPE and errors
I have 2 years of sales data, and I am using special US holidays as external regressors. Each regressor is all zeros except for a 1 on the days of that holiday, so almost all of the values are zero. I am using Fourier terms and passing my data into auto.arima as well as into a forecastHybrid model. But I am out of luck: my error is quite high.
Here is my code and data.
require("ggplot2")
library("tseries")
library("forecast")
library(chron)
library(Holidays)
library(HolidayCalendars)
setwd("D://Users/Shivam/Desktop/go2venky/")

#Load sales data - columns ds (date) and y (sales volume)
sales_data <- read.csv("indirect.csv", header=TRUE, stringsAsFactors = FALSE)
head(sales_data)

#format the date into Date column as we are reading from CSV
sales_data$Date <- as.Date(sales_data$ds)
#ggplot(sales_data, aes(x=Date, y=y))+geom_line() + scale_x_date('month') + ylab ('Daily Sales Volume')

#remove any outliers from the data set
No_outliers <- ts(sales_data[,c('y')])
sales_data$smoothvolume <- tsclean(No_outliers)

#plot the chart
ggplot(sales_data, aes(x=Date, y=smoothvolume)) +
  geom_line() +
  scale_x_date('month') +
  ylab('Daily Sales Volume') +
  coord_cartesian(ylim=c(0,300)) +
  geom_smooth(method="lm")

#Decompose data and see STL - Season, Trend, Remainder
sales_ma <- ts(na.omit(sales_data$smoothvolume), frequency = 30)
decomp_sales <- stl(sales_ma, s.window = "periodic")
plot(decomp_sales)
adj_sales = seasadj(decomp_sales)

#Build Holidays set to see in Arima
#end <- length(adj_sales[,1])
end <- 895 #count data is hardcoded to number of rows in the data file
thanksgiving <- rep(0,end)
christmas <- rep(0,end)
newyear <- rep(0,end)
memorial <- rep(0,end)
independence <- rep(0,end)
labor <- rep(0,end)
veterans <- rep(0,end)
goodfriday <- rep(0,end)
easter1 <- rep(0,end)
Product1Launch <- rep(0,end)
year <- rep(0,end)

for (i in 1:end) {
  date <- as.Date(sales_data[i,1], format="%m/%d/%Y")
  year[i] <- as.numeric(format(date, "%Y"))
  thanksgivingday <- holiday(as.numeric(format(date, "%Y")), Holiday="USThanksgivingDay")
  christmasday <- holiday(as.numeric(format(date, "%Y")), Holiday="USChristmasDay")
  newyearday <- holiday(as.numeric(format(date, "%Y")), Holiday="USNewYearsDay")
  memorialday <- holiday(as.numeric(format(date, "%Y")), Holiday="USMemorialDay")
  independenceday <- holiday(as.numeric(format(date, "%Y")), Holiday="USIndependenceDay")
  laborday <- holiday(as.numeric(format(date, "%Y")), Holiday="USLaborDay")
  veteransday <- holiday(as.numeric(format(date, "%Y")), Holiday="USVeteransDay")
  goodfridayday <- holiday(as.numeric(format(date, "%Y")), Holiday="USGoodFriday")
  easterday1 <- holiday(as.numeric(format(date, "%Y")), Holiday="Easter")
  Product1LaunchDay1 <- as.Date('20170914', format="%Y%m%d") #date hardcoded for the year
  #USNewYearsDay, USInaugurationDay, USMLKingsBirthday, USLincolnsBirthday, USWashingtonsBirthday,
  #USMemorialDay, USIndependenceDay, USLaborDay, USColumbusDay, USElectionDay, USVeteransDay
  #USThanksgivingDay, USChristmasDay, USCPulaskisBirthday, USGoodFriday
  if(as.numeric(date) == as.numeric(thanksgivingday)){thanksgiving[i:(i+4)] <- 1} #consider 4 days after Thanksgiving as holiday peak
  if(as.numeric(date) == as.numeric(christmasday)){christmas[(i-10):(i+5)] <- 1} #consider days before and after christmas also as holiday peak
  if(as.numeric(date) == as.numeric(newyearday)){newyear[i] <- 1}
  if(as.numeric(date) == as.numeric(memorialday)){memorial[i] <- 1}
  if(as.numeric(date) == as.numeric(independenceday)){independence[i] <- 1}
  if(as.numeric(date) == as.numeric(laborday)){labor[i] <- 1}
  if(as.numeric(date) == as.numeric(veteransday)){veterans[i] <- 1}
  if(as.numeric(date) == as.numeric(goodfridayday)){goodfriday[i] <- 1}
  if(as.numeric(date) == as.numeric(easterday1)){easter1[i] <- 1}
  if(as.numeric(date) == as.numeric(Product1LaunchDay1)){Product1Launch[i] <- 1}
}
special_days <- cbind(thanksgiving, christmas, newyear, memorial, independence, labor, veterans, goodfriday, easter1, Product1Launch)
View(special_days)

#
endf <- 5
thanksgivingf <- rep(0,endf)
christmasf <- rep(0,endf)
newyearf <- rep(0,endf)
memorialf <- rep(0,endf)
independencef <- rep(0,endf)
laborf <- rep(0,endf)
veteransf <- rep(0,endf)
goodfridayf <- rep(0,endf)
easterf <- rep(0,endf)
Product1Launchf <- rep(0,endf)
yearf <- rep(0,endf)

for (i in 1:endf) {
  datef <- as.Date(sales_data[i,1], format="%m/%d/%Y")
  yearf[i] <- as.numeric(format(datef, "%Y"))
  thanksgivingdayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="USThanksgivingDay")
  christmasdayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="USChristmasDay")
  newyeardayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="USNewYearsDay")
  memorialdayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="USMemorialDay")
  independencedayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="USIndependenceDay")
  labordayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="USLaborDay")
  veteransdayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="USVeteransDay")
  goodfridaydayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="USGoodFriday")
  easterdayf <- holiday(as.numeric(format(datef, "%Y")), Holiday="Easter")
  Product1LaunchDayf <- as.Date('20170914', format="%Y%m%d") #date hardcoded for the year
  #USNewYearsDay, USInaugurationDay, USMLKingsBirthday, USLincolnsBirthday, USWashingtonsBirthday,
  #USMemorialDay, USIndependenceDay, USLaborDay, USColumbusDay, USElectionDay, USVeteransDay
  #USThanksgivingDay, USChristmasDay, USCPulaskisBirthday, USGoodFriday.
  if(as.numeric(date) == as.numeric(thanksgivingdayf)){thanksgivingf[i:(i+4)] <- 1}
  if(as.numeric(date) == as.numeric(christmasdayf)){christmasf[(i-10):(i+5)] <- 1}
  if(as.numeric(date) == as.numeric(newyeardayf)){newyearf[i] <- 1}
  if(as.numeric(date) == as.numeric(memorialdayf)){memorialf[i] <- 1}
  if(as.numeric(date) == as.numeric(independencedayf)){independencef[i] <- 1}
  if(as.numeric(date) == as.numeric(labordayf)){laborf[i] <- 1}
  if(as.numeric(date) == as.numeric(veteransdayf)){veteransf[i] <- 1}
  if(as.numeric(date) == as.numeric(goodfridaydayf)){goodfridayf[i] <- 1}
  if(as.numeric(date) == as.numeric(easterdayf)){easterf[i] <- 1}
  if(as.numeric(date) == as.numeric(Product1LaunchDayf)){Product1Launchf[i] <- 1}
}
special_daysf <- cbind(thanksgivingf, christmasf, newyearf, memorialf, independencef, laborf, veteransf, goodfridayf, easterf, Product1Launchf)

#Build the Auto Arima with z - full dataset, zf - future for 5 days
#Auto.Arima with external regression variables (holidays as covariates)
x <- msts(adj_sales, seasonal.periods=c(7,365.25))
z <- fourier(x, K=c(2,5))
zf <- fourierf(x, K=c(2,5), h=5)
fit <- auto.arima(x, xreg=cbind(z,special_days), seasonal=FALSE)
fc <- forecast(fit, xreg=cbind(zf,special_daysf), h=5)
fc
plot(fc)
accuracy(fc)
# ME RMSE MAE MPE
# Training set 0.8078107 8610.461 6407.322 79.02104
# MAPE MASE ACF1
# Training set 247.9248 0.4788379 0.003789123

##########################################
####### Forecast Hybrid Model ############
##########################################
library(forecastHybrid)
# Create the model
hy_model <- hybridModel(x, models = "at", a.args = list(xreg = cbind(z,special_days)))
# Forecast future values
hy_model_fc <- forecast(hy_model, xreg = cbind(zf,special_daysf))
plot(hy_model_fc)
accuracy(hy_model_fc)
# ME RMSE MAE MPE MAPE MASE
# Training set 175.1741 7948.943 5668.673 31.12881 167.3496 0.4236366
# ACF1
# Training set 0.007117223
data - https://drive.google.com/open?id=1wcKKeldFfrPEOx_6fHf2rMCRGKpPGFPy
xreg - train - https://drive.google.com/open?id=1X39bMZGLWL5L3NrVLTu8JUfEWeGNyXnM
xreg - test - https://drive.google.com/open?id=11IXVoVV4C_zd8XCCtbNzhY0UVBZmJrt

How to print Unique Squares Of Numbers In Java 8?
Here is my code to find the unique numbers and print their squares. How can I convert this code to Java 8, since it would be a better fit for the Stream API?
List<Integer> numbers = Arrays.asList(3, 2, 2, 3, 7, 3, 5);
HashSet<Integer> uniqueValues = new HashSet<>(numbers);
for (Integer value : uniqueValues) {
    System.out.println(value + "\t" + (int)Math.pow(value, 2));
}
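One possible Stream API version of the loop above is sketched here. The class and method names (UniqueSquares, uniqueSquareLines) are made up for illustration. Unlike the HashSet version, distinct() on a list-backed stream keeps values in the order they first appear, so the lines may print in a different order than the original loop.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of the original question.
final class UniqueSquares {
    // Returns one "value<TAB>square" line per distinct input value, in first-seen order.
    static List<String> uniqueSquareLines(List<Integer> numbers) {
        return numbers.stream()
                .distinct()                    // drop repeated values, keeping encounter order
                .map(n -> n + "\t" + (n * n))  // integer square, avoids the Math.pow cast
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 2, 2, 3, 7, 3, 5);
        uniqueSquareLines(numbers).forEach(System.out::println);
    }
}
```

If sorted output is preferred, a sorted() step after distinct() would be one way to get it.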

Data structure that inserts in constant time at endpoints and before/after an element?
I am looking for a data structure that:
- Has an unbounded size.
- Maintains the insertion order of its elements.
- Inserts efficiently at the beginning and end of the list (ideally in constant time).
- Inserts efficiently before or after an existing element (ideally in constant time).
I ruled out ArrayList because it isn't efficient at inserting at the beginning of the list.
On the surface LinkedList should be a perfect match, but in practice the Java implementation isn't efficient at inserting before or after existing elements (i.e. it walks the entire list to find the insertion position).
Is there a 3rd-party Collection that does this?
Update: For my particular use case, elements do not implement equals(), so it is impossible for them to be duplicates.
Motivation: I am building an event queue that allows occasional cheating (inserting before or after an existing event).
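One way to meet all four requirements without a third-party library is a hand-rolled doubly linked list that returns a node handle from every insert, so the caller can later insert next to that node without searching. The sketch below is only an illustration: HandleList and Node are hypothetical names, it does not implement java.util.Collection, and it assumes callers keep the handles they get back.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-rolled list: every add/insert returns a Node handle, and
// holding that handle is what makes insertBefore/insertAfter constant time.
final class HandleList<T> {
    static final class Node<T> {
        final T value;
        Node<T> prev, next;
        Node(T value) { this.value = value; }
    }

    // Sentinel nodes remove the null checks at both ends; they are never handed out.
    private final Node<T> head = new Node<>(null);
    private final Node<T> tail = new Node<>(null);

    HandleList() { head.next = tail; tail.prev = head; }

    Node<T> addFirst(T value) { return insertAfter(head, value); }
    Node<T> addLast(T value) { return insertBefore(tail, value); }
    Node<T> insertBefore(Node<T> node, T value) { return insertAfter(node.prev, value); }

    // O(1): only the links around the insertion point change.
    Node<T> insertAfter(Node<T> node, T value) {
        Node<T> fresh = new Node<>(value);
        fresh.prev = node;
        fresh.next = node.next;
        node.next.prev = fresh;
        node.next = fresh;
        return fresh;
    }

    // Copies the elements out in list order (O(n)), mainly for inspection.
    List<T> toList() {
        List<T> out = new ArrayList<>();
        for (Node<T> n = head.next; n != tail; n = n.next) out.add(n.value);
        return out;
    }

    public static void main(String[] args) {
        HandleList<String> events = new HandleList<>();
        Node<String> b = events.addLast("b");
        events.addFirst("a");
        events.addLast("d");
        events.insertAfter(b, "c");   // "cheating": insert right after an existing event
        events.insertBefore(b, "a2");
        System.out.println(events.toList());
    }
}
```

Because positions are identified by node handles rather than by element lookup, the sketch never calls equals() on the elements, which fits the constraint in the update above.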

Counting smileys in a List of strings
I am trying to count the occurrences of smileys in a given List of Strings.
Smileys are in the format of : or ; for eyes, an optional nose - or ~, and a mouth of ) or D.
import java.util.*;

public class SmileFaces {
    public static int countSmileys(List<String> arrow) {
        int countF = 0;
        for (String x : arrow) {
            if (x.charAt(0) == ';' || x.charAt(0) == ':') {
                if (x.charAt(1) == '-' || x.charAt(1) == '~') {
                    if (x.charAt(2) == ')' || x.charAt(2) == 'D') {
                        countF++;
                    } else if (x.charAt(1) == ')' || x.charAt(1) == 'D') {
                        countF++;
                    }
                }
            }
        }
        return countF;
    }
}
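A regular expression can express the eyes / optional nose / mouth rule in one pattern, which avoids the nested charAt checks (and the index errors they can raise on short strings). The sketch below assumes each list element is a single candidate face that must match in full; the class name SmileyCounter is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical alternative to the charAt-based version in the question.
final class SmileyCounter {
    // Eyes ':' or ';', optional nose '-' or '~', mouth ')' or 'D'.
    private static final Pattern SMILEY = Pattern.compile("[:;][-~]?[)D]");

    static int countSmileys(List<String> faces) {
        int count = 0;
        for (String face : faces) {
            // matches() requires the whole string to be a smiley, so ":-)x" is rejected.
            if (SMILEY.matcher(face).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countSmileys(Arrays.asList(":)", ";~D", ":-(", ":>", ";-)")));
    }
}
```

The nose is handled by the optional group `[-~]?`, so two-character faces such as :) and three-character faces such as ;-D go through the same check.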

Rename Columns of dataframe based on names of list in R
I have multiple dataframes saved in a list object. They share the same two column names. I'd like to rename the second column to the name of the dataframe.
Example Data:
df1 <- data.frame(A = 1:10, B = 11:20)
df2 <- data.frame(A = 21:30, B = 31:40)
df3 <- data.frame(A = 31:40, B = 41:50)
df4 <- data.frame(A = 51:80, B = 61:70)
listDF <- list(df1, df2, df3, df4)
I'm trying to use lapply to rename the second column to match the name of the dataframe.
# trying to rename second column after the element of the list they're located in
listDF_2 <- lapply(names(listDF), function(x) setNames(listDF[[x]], x))

Python - data.to_csv output format
From a csv file having the following format:
Date,Data
010101,111
020202,222
030303,333
I am calculating the monthly average of the values using the following code:
import pandas as pd

data = pd.read_csv("input.csv")
data['Month'] = pd.DatetimeIndex(data.reset_index()['Date']).month
mean_data = data.groupby('Month').mean()
Then I output a csv file using the following command:
mean_data.to_csv("test.csv")
It works fine and gives me the following output:
Month,Data
01,01
02,02
03,03
04,04
...
But now I would like to know how many data points were included in each monthly average. For that I changed:
mean_data = data.groupby('Month').mean()
by:
mean_data = data.groupby(['Month']).agg(['mean', 'count'])
But this is where the problem comes in. When I output the csv, I now get a weird format, as follows:
Data,Data, mean,count, Month, 01, 01,8, 02, 02,9, 03, 03,7, 04, 04,5,
Which is not really convenient. Instead I would like to have the following output:
Month,Mean,Count
01,01,8
02,02,9
03,03,7
04,04,5
Does anyone know how to achieve that?

New to R: need help concatenating column names
I am very new to R, and am struggling to get started. Specifically, I am generating 5 different predictions and adding those predictions to an existing dataframe. My code is:
for (j in i) {
  …
  actual.predicted <- data.frame(test_data, predicted)
}
I am trying to concatenate words together to create new column names, in the loop. Specifically, I have a column named “predicted” and I am generating predictions in each iteration of the loop. So, in the first iteration, I want the new column name to be “predicted.1” and for the second iteration, the new column name should be “predicted.2” and so on.
Any thoughts would be greatly appreciated.