Sequence of dates by each next 15th or last day of month [R]
I have start and end dates, e.g. 03-05-2014 and 01-07-2014. I would like to write code that outputs a sequence such as the next 15th of each month or the next last day of each month; in this case 15-05-2014 and 15-06-2014, or 31-05-2014 and 30-06-2014.
I know this would be possible by creating a sequence of days and then picking out the particular day (at least for the first case), as in seq(1st date, 2nd date, by = "day"), but let's just say that computational limitations rule this out: I have to do it for many years and millions of records, which need to be grouped as well.
Is there a workaround?
2 answers

    d1 = as.Date("03-05-2014", "%d-%m-%Y")
    d2 = as.Date("01-07-2014", "%d-%m-%Y")
    library(lubridate)
    d1_p = round_date(d1, unit = "month")
    d2_p = round_date(d2, unit = "month")
    mydates = seq.Date(d1_p, d2_p, "months")
    mydates = mydates[mydates < d2_p]
    lapply(mydates, function(x) x + 14:15)
    #[[1]]
    #[1] "2014-05-15" "2014-05-16"
    #[[2]]
    #[1] "2014-06-15" "2014-06-16"
    ceiling_date(mydates, unit = "month") - 1
    #[1] "2014-05-31" "2014-06-30"

    start <- as.Date("2014-05-03")
    end <- as.Date("2014-07-01")
    library(lubridate)
    floor_date(seq(start, end, by = 'month'), unit = "month") + 14
    ceiling_date(seq(start, end, by = 'month'), unit = "month") - 1

Sequence by the month and use floor_date from the lubridate package to start at the beginning of the month.
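The same idea can be sketched in base R, without lubridate. This is only a minimal sketch under stated assumptions: the helper names month_starts, mid_month and month_end are hypothetical (they are not from either answer), and dropping dates that fall outside the start/end range is an assumption about what the asker wants. It steps one month at a time, so no daily sequence is ever built.

```r
# Hypothetical helpers (names are illustrative, not from the answers above).
# Base R only; steps by month instead of by day.

# First day of every month from start's month up to end.
month_starts <- function(start, end) {
  first <- as.Date(format(start, "%Y-%m-01"))  # first day of start's month
  seq(first, end, by = "month")
}

# The 15th of each month, kept only if it lies in [start, end] (assumption).
mid_month <- function(start, end) {
  d <- month_starts(start, end) + 14
  d[d >= start & d <= end]
}

# The last day of each month, kept only if it lies in [start, end] (assumption).
month_end <- function(start, end) {
  firsts <- month_starts(start, end)
  # First day of each following month, minus one day, is the month's last day.
  next_firsts <- seq(firsts[1], by = "month", length.out = length(firsts) + 1)[-1]
  d <- next_firsts - 1
  d[d >= start & d <= end]
}

start <- as.Date("2014-05-03")
end   <- as.Date("2014-07-01")
mid_month(start, end)   # "2014-05-15" "2014-06-15"
month_end(start, end)   # "2014-05-31" "2014-06-30"
```

Because each helper is vectorised over months, the cost grows with the number of months in the range rather than the number of days.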
See also questions close to this topic

R - nested loop for list of SpatialLinesDataFrame intersected with SpatialPolygonsDataFrame objects
I have a series of steps I need to complete on a list of SpatialLinesDataFrame ('lines' herein) objects based on their relationships with individual features within a multi-feature SpatialPolygonsDataFrame ('polygons') object. In short, each line list element originates inside a single polygon feature, and may or may not pass through one or more other polygon features. I want to update each line element to connect origin polygons to the first point of contact for each individual polygon intersected by the line element. So, each line element may become multiple new line features (n = number of intersected polygons). I would like to do this efficiently, as my lines lists and polygon features are numerous. I have provided example data and a STEP-wise description of what I am trying to do below. I am new to R and not a programmer, so I don't know if any of what I propose is valid.
    library(sp)
    library(rgdal)
    library(raster)

    ###example data prep START
    #example 'RDCO Regional Parks' data can be downloaded here:
    #https://datardco.opendata.arcgis.com/datasets group_ids=1950175c56c24073bb5cef3900e19460
    parks <- readOGR("/path/to/example/data/RDCO_Regional_Parks/RDCO_Regional_Parks.shp")
    plot(parks)

    #subset watersheds for example
    parks_sub <- parks[parks@data$Shapearea > 400000,]
    plot(parks_sub, col='green', axes = T)

    #create SpatialLines from scratch
    pts_line1 <- cbind(c(308000, 333000), c(5522000, 5530000))
    line1 <- spLines(pts_line1, crs = crs(parks_sub))
    plot(line1, axes=T, add=T) #origin polygon = polyl[[4]] = OBJECTID 181
    pts_line2 <- cbind(c(308000, 325000), c(5524000, 5537000))
    line2 <- spLines(pts_line2, crs = crs(parks_sub))
    plot(line2, axes=T, add=T) #origin polygon = polyl[[8]] = OBJECTID 1838
    linel <- list()
    linel[[1]] <- line1
    linel[[2]] <- line2

    #convert to SpatialLinesDataFrame objects
    lineldf <- lapply(1:length(linel), function(i)
      SpatialLinesDataFrame(linel[[i]],
                            data.frame(id=rep(i, length(linel[[i]]))),
                            match.ID = FALSE))

    #match id field value with origin polygon
    lineldf[[1]]@data$id <- 181
    lineldf[[2]]@data$id <- 1838
    ###example data prep END

    #initiate nested for loop
    for (i in 1:length(lineldf)) {
      for (j in 1:length(parks_sub)) {
        #STEP 1: for each line list feature (NB: with ID matching origin polygon ID)
        #identify whether it intersects with a polygon list feature
        if (tryCatch(!is.null(intersect(lineldf[[i]], parks_sub[j,])),
                     error=function(e) return(FALSE)) == 'FALSE'){
          next
        }
        #if 'FALSE', go on to check intersect with next polygon in list
        #if 'TRUE', go to STEP 2
        #STEP 2: add intersected polygon OBJECTID value to SLDF new column in attribute table
        #i.e., deal with single intersected polygon at a time
        else {
          lineldf[[i]]@data["id.2"] = parks_sub[j,]@data$OBJECTID
          #STEP 3: erase portion of line overlapped by intersected SPDF
          line_erase <- erase(lineldf[[i]], parks_sub[j,])
          #STEP 4: erase line feature(s) that no longer intersect with the origin polygon
          #DO NOT KNOW HOW TO SELECT feature (i.e., line segment) within 'line_erase' object
          if (tryCatch(!is.null(intersect(line_erase[???], parks_sub[j,])),
                       error=function(e) return(FALSE)) == 'FALSE'){
            line_erase[???] <- NULL
          } else {
            #STEP 5: erase line features that only intersect with origin polygon
            if (line_erase[???]@data$id.2 == parks_sub[j,]@data$OBJECTID){
              line_erase[???] <- NULL
            } else {
              #STEP 6: write valid line files to folder
              writeOGR(line_erase, dsn = paste0("path/to/save/folder", i, ".shp"),
                       layer = "newline", driver = 'ESRI Shapefile', overwrite_layer = T)
            }
          }
        }
      }
    }

How to format ggplot `geom_text` with formula, getting unwanted "c(...)"
In my ggplot2 code below, I want to show the formula for a linear-regression fit on my plot with geom_text, but I get an unwanted c(...) before the values of a and b. How do I prevent this?

    p <- ggplot(data=Algae, aes(x=a254, y=DOC)) +
      geom_point(color="blue", stat="identity") +
      geom_smooth(method="lm", se=FALSE, color="red", formula=y~x)
    model.lm <- lm(DOC~a254, data=Algae)
    l <- list(a=format(coef(model.lm)[1], digits=4),
              b=format(coef(model.lm)[2], digits=4),
              r2=format(summary(model.lm)$r.squared, digits=4),
              p=format(summary(model.lm)$coefficients[2,4], digits=4))
    eq <- substitute(italic(DOC) == a - b %*% italic(a254)~","~italic(R)^2~"="~r2~", "~italic(P)~"="~p, l)
    p1 <- p + geom_text(aes(x = 6, y = 0, label = as.character(as.expression(eq))), parse = TRUE)
    p1

Hyperparameter Optimization to Find Minimum of Function in R
I have a dataset of 48 months that compares Actual and Estimated values based on a function. The function relies on a rate, X. I am trying to minimize the sum of errors (Actual - Estimated) for the entire time period. I am able to get the optimized X with optimize in R. However, I want to be able to find the right X for particular time periods that would give me a smaller sum of errors for the entire time period. What is the best way to do this in R? Or what method would work best for me to do this?

    Actual Accounts = Actual Observations
    Cancelled = lag(Estimated Accounts) * .05
    Estimated Accounts = Actual_Account - Cancelled

Measure the time with time.time() and Keydown
I have sounds that I play from a list called sounds. The program plays a sound, stores the time when the sound is played in start, waits 6 seconds and plays the next sound from the list. Now I want to capture a reaction time within these 6 seconds with a keydown. If the condition is true, I click the button, and it captures the time and stores it in end. Then the difference between end and start should give me the result. The problem is that it does not measure the time correctly. It always gives me milliseconds, even if I wait longer before I click. I wonder what I am doing wrong here?

    start = time.time()
    for i in range(len(arr)):
        pygame.mixer.music.load(sounds[i])
        pygame.mixer.music.play()
        for e in pygame.event.get():
            if e.type == pygame.KEYDOWN:
                if e.key == pygame.K_RIGHT:
                    if condition:
                        end = time.time()
                        diff = end - start
        while pygame.mixer.music.get_busy():
            time.sleep(6)

Find difference in dates among entries across multiple columns
df1=

    Date   Team1      Team2
    6/1    Boston     New York
    6/13   New York   Boston
    6/27   Boston     New York

I am trying to calculate the number of days since the last time Boston appeared in either column, but I can only figure out how to look it up within one column, using df1['Days since Boston played'] = df1.groupby('Team1')['Date'].diff().fillna(0)
What I would like the output to be:

    Date   Team1      Team2      Days since Boston played
    6/1    Boston     New York   0
    6/13   New York   Boston     12
    6/27   Boston     New York   14

EDIT - expanding the dataframe to learn how this can be applied to all teams, not just one. What I would like the output to be:

    Date   Team1      Team2      Days since **Team1** played
    6/1    Boston     New York   0
    6/13   New York   Chicago    12
    6/27   Boston     New York   14
    6/28   Chicago    Boston     15

Ruby Date Gem Invalid Date
So I am iterating through a hash where one of the key/values is {date: => 'MM/DD/YYYY'}
When I iterate through, I am using the date gem to find out what day of the week each date is (0-6).
I want the day of the week for the index I am currently at as an integer so I can compare it to another integer; the idea is to check whether the day of the week at that index is the same as the day of the week I am searching for.
To get that int I run the following commands:
    d = Date.parse(hash[i].values[2])
    day_of_the_week = d.cwday

When I do this on its own for a single cherry-picked date it works fine, but when I iterate through the hash, what I get is:
search.rb:25:in `parse': invalid date (ArgumentError)
for the particular date '9/13/17'.
Is there something wrong with '9/13/17'? Why does this actually work for other days (it starts at '9/5/17') and then get randomly stuck at this day?
And as I was writing this, I did a little digging and found exactly what index it was:
    d = Date.parse(hash[4224].values[2])
    day_of_the_week = d.cwday

This gives me the same error. I am completely baffled; what is going on? Also, it's not the lack of MM in 9/etc., because every other month is written the same way.
EDIT: The result should be 2, September 12th 2017 was a Tuesday.

seq creates surprising intervals
I recently saw this on twitter and started to wonder what's going on.
I read the seq documentation, but fail to understand why seq(from=-1.4, to=1.4, by=0.2) doesn't produce -1.4, -1.2, -1.0, ....
Any idea?
As expected
    i <- seq(from=-1.5, to=1.5, by=0.2)
    format(i, scientific = F)
    #>  [1] "-1.5" "-1.3" "-1.1" "-0.9" "-0.7" "-0.5" "-0.3" "-0.1" " 0.1" " 0.3"
    #> [11] " 0.5" " 0.7" " 0.9" " 1.1" " 1.3" " 1.5"
Unexpected
    i <- seq(from=-1.4, to=1.4, by=0.2)
    format(i, scientific = F)
    #>  [1] "-1.3999999999999999111822" "-1.1999999999999999555911"
    #>  [3] "-0.9999999999999998889777" "-0.7999999999999998223643"
    #>  [5] "-0.5999999999999998667732" "-0.3999999999999999111822"
    #>  [7] "-0.1999999999999997335465" " 0.0000000000000002220446"
    #>  [9] " 0.2000000000000001776357" " 0.4000000000000001332268"
    #> [11] " 0.6000000000000000888178" " 0.8000000000000002664535"
    #> [13] " 1.0000000000000004440892" " 1.2000000000000001776357"
    #> [15] " 1.3999999999999999111822"

Difficulties plotting confidence intervals onto 3d plot plane
I have predicted values and confidence intervals that I want to add to my 3D plot using trans3d, but I get an error on the line that uses seq. I already tried length(z.bin) and read other possible solutions, but it's still not working.

    Error in seq.default(lowerCI, upperCI, length.out = 25) : 'from' must be of length 1
I hope you can help me to fix my code. Here are the predicted values (z.bin), upper CI (UCI) and lower CI (LCI):
    z.bin = c(0.0293498087331418, 0.090245714112389, 0.184180408140189, 0.288479689911685, 0.380290727519617, 0.447221380019439, 0.486749948207999, 0.515460732539617, 0.524544278048373, 0.517863012982977, 0.499015552138662, 0.471040830332284, 0.436384769878271, 0.39696995466237, 0.354295721949241, 0.309542936297033, 0.263681366413638, 0.217589473510825, 0.172201272125033, 0.128688774135519, 0.0886552840745102, 0.0542241604227149, 0.0277504883386967, 0.0108213094005216, 0.00277584412160996)
    UCI = c(0.0366603230533126, 0.0902131425743432, 0.190710608825939, 0.329281535177887, 0.37359325824382, 0.49083302601992, 0.502923852215148, 0.532414036794941, 0.542594424500199, 0.544876477822669, 0.513975201348124, 0.500360540087923, 0.460641689148807, 0.415363280410005, 0.358399020245284, 0.321189810843667, 0.285678220416678, 0.234306786216362, 0.185151688725085, 0.141800528101782, 0.0848830167493455, 0.0596895934068413, 0.034797331186028, 0.0136423698337293, 0.00416130620917585)
    LCI = c(0.0203880237502624, 0.0639803379126716, 0.15252099326726, 0.279883133515488, 0.321969495145084, 0.433138773211774, 0.445700330934391, 0.474863237969827, 0.485779389412345, 0.489219946727086, 0.461012139808171, 0.449297954511444, 0.412682077834953, 0.370799794091489, 0.317884618001687, 0.283779930784182, 0.251320227770169, 0.20400383106003, 0.158982316141284, 0.119627373509671, 0.0683623411169277, 0.0464255905587446, 0.0252020843583765, 0.00810835262770212, 0.0014811836711362)
Code (please don't run the lines points() and trans3d(); res2 is not included, but it's there to show you the loop I want to use to create the CI bars):

    y.bin <- rep(1,25)
    x.bin <- seq(-10,10,length.out = 25)
    # points(trans3d(x.bin, y.bin, z.bin, pmat = res2), col = 1, pch = 16)
    for (i in 1:length(z.bin)) {
      lowerCI <- LCI
      upperCI <- UCI
      CI.bar <- seq(lowerCI,upperCI,length.out=25)
      # lines (trans3d(x.bin[i], y.bin[i], z = CI.bar, pmat = res2), col = 1, lwd=2)
    }

Get sequence of numbers in bash without using seq
I want to loop over a sequence of numbers in a BASH script, but the bound values are not constants. I know this syntax:
    for year in {2000..2010}; do
        echo ${year}
    done
But the values of 2000 and 2010 are changing in my case, so what I am doing right now is this:
    for year in `seq ${yeari} ${yeare}`; do
        echo ${year}
    done

Is there a bash-native way to do the same, without using seq?