Bind spatial data frames
I have a long panel dataset (Data_Base) with the names of the 32 states in Mexico over 28 years (928 obs). In another .rds file I have the spatial data for these 32 states (mex.sf), 32 obs in total. I want to bind the dataset with the spatial data. Of course, the spatial data from mex.sf should be repeated 28 times per state because the two datasets differ in length. I tried the following command, specifying the state name as the merge condition:
example <- merge(Data_Base, mex.sf, by=intersect(Data_Base$State,mex.sf$NAME_1))
Nevertheless, I ended up with the following error message:
Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
Below is an example of what the data look like.
Does anyone know why I cannot bind them?
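A likely reading of the error, offered as a sketch rather than a confirmed diagnosis: `by` expects column names, while `intersect(Data_Base$State, mex.sf$NAME_1)` returns the state names themselves, which are not columns in either table. The example below uses small made-up data frames (`mex_shapes` and its `geom_id` column are hypothetical stand-ins for `mex.sf` and its geometry) and names the key column on each side with `by.x` and `by.y`.

```r
# Toy stand-ins for the real objects (hypothetical data, not the asker's files).
Data_Base <- data.frame(
  State = rep(c("Aguascalientes", "Colima", "Sonora"), each = 2),
  Year  = rep(c(1990, 1991), times = 3),
  value = c(1.2, 1.4, 0.8, 0.9, 2.1, 2.3)
)
mex_shapes <- data.frame(
  NAME_1  = c("Aguascalientes", "Colima", "Sonora"),
  geom_id = c("poly_A", "poly_C", "poly_S")  # placeholder for the geometry column
)

# `by.x` / `by.y` take the NAME of the key column in each table,
# not the values the two columns have in common.
example <- merge(mex_shapes, Data_Base, by.x = "NAME_1", by.y = "State")

# Each state's shape row is repeated once per year of panel data.
print(example)
```

With the real objects, the same call would be `merge(mex.sf, Data_Base, by.x = "NAME_1", by.y = "State")`. Assuming the sf package is loaded, putting the sf object first should keep the result an sf object.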
See also questions close to this topic
-
pivot_wider does not keep all the variables
I would like to keep the variable cat (category) in the output of my function. However, I am not able to keep it. The idea is to apply a function similar to m <- 1 - (1 - se * p2)^df$n based on the category. But in order to perform that step, I need to keep the variable category. Here's the code:
#script3
suppressPackageStartupMessages({
  library(mc2d)
  library(tidyverse)
})

sim_one <- function() {
  df <- data.frame(
    id = c(1:30),
    cat = c(rep("a", 12), rep("b", 18)),
    month = c(1:6, 1, 6, 4, 1, 5, 2, 3, 2, 5, 4, 6, 3:6, 4:6, 1:5, 5),
    n = rpois(30, 5)
  )
  nr <- nrow(df)
  df$n[df$n == "0"] <- 3
  se <- rbeta(nr, 96, 6)
  epi.a <- rpert(nr, min = 1.5, mode = 2, max = 3)
  p <- 0.2
  p2 <- epi.a*p
  m <- 1 - (1 - se * p2)^df$n
  results <- data.frame(month = df$month, m, df$cat)
  results %>%
    arrange(month) %>%
    group_by(month) %>%
    mutate(n = row_number(), .groups = "drop") %>%
    pivot_wider(
      id_cols = n,
      names_from = month,
      names_glue = "m_{.name}",
      values_from = m
    )
}

set.seed(99)
iters <- 1000
sim_list <- replicate(iters, sim_one(), simplify = FALSE)
sim_list[[1]]
#> # A tibble: 7 x 7
#>       n    m_1    m_2    m_3    m_4   m_5    m_6
#>   <int>  <dbl>  <dbl>  <dbl>  <dbl> <dbl>  <dbl>
#> 1     1  0.970  0.623  0.905  0.998 0.929  0.980
#> 2     2  0.912  0.892  0.736  0.830 0.890  0.862
#> 3     3  0.795  0.932  0.553  0.958 0.931  0.798
#> 4     4  0.950  0.892  0.732  0.649 0.777  0.743
#> 5     5 NA     NA     NA      0.657 0.980  0.945
#> 6     6 NA     NA     NA      0.976 0.836 NA
#> 7     7 NA     NA     NA     NA     0.740 NA
Created on 2022-05-07 by the reprex package (v2.0.1)
-
calculate weighted average over several columns with NA
I have a data frame like this one:
ID  duration1  duration2  total_duration  quantity1  quantity2
1   5          2          7               3          1
2   NA         4          4               3          4
3   5          NA         5               2          NA
I would like to do a weighted mean for each subject like this:
df$weighted_mean<- ((df$duration1*df$quantity1) + (df$duration2*df$quantity2) ) / (df$total_duration)
But as I have NA, this command does not work and it is not very nice....
The result would be this:
ID  duration1  duration2  total_duration  quantity1  quantity2  weighted_mean
1   5          2          7               3          1          2.43
2   NA         4          4               3          4          4
3   5          NA         5               2          NA         2
Thanks in advance for the help
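One possible approach, sketched in base R under the assumption that a missing duration/quantity pair should simply contribute nothing to the sum: multiply the duration and quantity columns element-wise, then add the products per row with `rowSums(..., na.rm = TRUE)`. The data frame below is rebuilt from the table in the question.

```r
# Data rebuilt from the table in the question.
df <- data.frame(
  ID             = 1:3,
  duration1      = c(5, NA, 5),
  duration2      = c(2, 4, NA),
  total_duration = c(7, 4, 5),
  quantity1      = c(3, 3, 2),
  quantity2      = c(1, 4, NA)
)

# Element-wise products duration_k * quantity_k, one column per pair.
products <- as.matrix(df[, c("duration1", "duration2")]) *
  as.matrix(df[, c("quantity1", "quantity2")])

# Sum the products per row, ignoring NA pairs, then divide by the total duration.
df$weighted_mean <- rowSums(products, na.rm = TRUE) / df$total_duration

print(round(df$weighted_mean, 2))
```

Note that a row in which every pair is NA would get a sum of 0 rather than NA, so such rows may need separate handling.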
-
I am trying to extract data from a netCDF file using R for a specific location; the code I've written is shown below and I get an error at the end of the code
I need some help with extracting data from NetCDF files using R. I downloaded them from CORDEX (the Coordinated Regional climate Downscaling Experiment). In total I have several files. These files have dimensions (longitude, latitude, time) and the variable is maximum temperature (tasmax). For a specific location, I need to extract the tasmax data at different times. I wrote the code in R, but at the end of the code an error appeared: subscript out of bounds.
getwd()
setwd("C:/Users/20120/climate change/rcp4.5/tasmax")
dir()
library("ncdf4")
library(ncdf4.helpers)
library("chron")
ncin <- nc_open("tasmax_AFR-44_ICHEC-EC-EARTH_rcp45_r1i1p1_KNMI-RACMO22T_v1_mon_200601-201012.nc")
lat <- ncvar_get(ncin, "lat")
lon <- ncvar_get(ncin, "lon")
tori <- ncvar_get(ncin, "time")
title <- ncatt_get(ncin,0,"title")
institution <- ncatt_get(ncin,0,"institution")
datasource <- ncatt_get(ncin,0,"source")
references <- ncatt_get(ncin,0,"references")
history <- ncatt_get(ncin,0,"history")
Conventions <- ncatt_get(ncin,0,"Conventions")
tustr <- strsplit(tunits$value,"")
ncin$dim$time$units
ncin$dim$time$calendar
tas_time <- nc.get.time.series(ncin, v = "tasmax", time.dim.name = "time")
tas_time[c(1:3, length(tas_time) - 2:0)]
tmp.array <- ncvar_get(ncin,"tasmax")
dunits <- ncatt_get(ncin,"tasmax","units")
tmp.array <- tmp.array-273.15
tunits <- ncatt_get(ncin,"time","units")
nc_close(ncin)
which.min(abs(lat-28.9))
which.min(abs(lon-30.2))
tmp.slice <- tmp.array[126,32981,]
tmp.slice
Error in tmp.array[126, 32981, ] : subscript out of bounds
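One plausible cause, assuming `lat` and `lon` are stored as 2-D arrays (as is common for rotated-pole grids): `which.min()` on a matrix returns a single linear index, which cannot be used as a row or column subscript. The sketch below uses small synthetic grids in place of the real file, with made-up dimensions `nx` and `ny`, and finds the nearest cell with `which(..., arr.ind = TRUE)`. The squared difference in degrees is only a rough proximity measure.

```r
# Synthetic 2-D coordinate grids standing in for ncvar_get(ncin, "lat") / "lon".
nx <- 5
ny <- 4
lon <- matrix(rep(seq(28, 32, length.out = nx), times = ny), nrow = nx)
lat <- matrix(rep(seq(27, 30, length.out = ny), each = nx), nrow = nx)

# Synthetic data cube with dimensions (x, y, time), standing in for tmp.array.
tmp.array <- array(seq_len(nx * ny * 3), dim = c(nx, ny, 3))

# Squared distance (in degrees) from every grid cell to the target point.
d2 <- (lat - 28.9)^2 + (lon - 30.2)^2

# arr.ind = TRUE returns the row/column position instead of a linear index.
idx <- which(d2 == min(d2), arr.ind = TRUE)[1, ]

# Time series at the nearest grid cell.
tmp.slice <- tmp.array[idx["row"], idx["col"], ]
print(idx)
print(tmp.slice)
```

If `lat` and `lon` turn out to be plain 1-D vectors in the real file, the original `which.min()` calls are fine and the subscripts only need to follow the dimension order reported by `dim(tmp.array)`.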
-
Coloring polygons (made with concaveman) in ggplot by column
I'm adapting code for a map made by someone else that uses a package concaveman to generate concave hulls from points. The goal is to plot a number of different polygons in the oceans, and to color-code them by a grouping variable. The code works great to make a map of all the polygons and color-code them by identity:
library(sf)
library(concaveman)
library(data.table)
library(ggplot2)
dat <- data.table(
  longitude = c(-131.319783, -131.141266, -131.08165, -131.079066, -130.894966, -131.063783, -131.10855, -131.215533, -131.189816, -131.14565, -131.200866, -131.046466, -130.94055, -130.928983, -130.7513, -130.8406, -130.833433, -130.830666, -130.82205, -130.89, -63.3666666666667, -63.3666666666667, -63.1666666666667, -64.1833333333333, -63.3166666666667, -63.3, -63.85, -63.9333333333333, -63.9333333333333, -63.5833333333333, -63.5833333333333, -63.7, -63.7, -63.2833333333333, -63.5833333333333, -63.95, -64.1833333333333, -63.8833333333333, -63.8, -63.2166666666667, -5.6788, -5.4408, -5.6835, -5.424, -5.6475, -5.4371, -5.6181, -5.4446, -5.6753, -5.4366, -5.6746, -5.4448, -5.6642, -5.4411, -5.666, -5.4408, -5.624, -5.4321, -5.6806, -5.4473),
  latitude = c(52.646633, 52.589683, 52.556516, 52.559816, 52.402916, 52.5983, 52.554216, 52.550883, 52.539166, 52.658216, 52.627966, 52.481733, 52.486033, 52.469033, 52.469166, 52.261833, 52.292133, 52.301066, 52.3523, 52.366966, 48.4666666666667, 48.4666666666667, 48.65, 49.0166666666667, 48.8166666666667, 48.8166666666667, 49.1, 48.8666666666667, 48.8666666666667, 48.8, 48.8166666666667, 48.4833333333333, 48.4833333333333, 48.8, 48.8166666666667, 48.8833333333333, 49.05, 49.0833333333333, 48.7166666666667, 48.6666666666667, 54.7201, 54.6033, 54.7191, 54.5733, 54.7225, 54.5923, 54.7261, 54.6076, 54.719, 54.5978, 54.7195, 54.6108, 54.7204, 54.6062, 54.7214, 54.5923, 54.7275, 54.592, 54.7207, 54.6188),
  group = c(rep('NEPac',20),rep('NWAtl',20),rep('NEAtl',20))
)
split <- split(dat, dat$group)
split.sf <- lapply(split, st_as_sf, coords = c("longitude", "latitude"))
concave <- lapply(split.sf, concaveman, concavity = 3, length_threshold = 2)
concave.binded <- do.call('rbind', concave)
concave.spdf <- as_Spatial(concave.binded)
ggplot() +
  geom_polygon(data = concave.spdf, aes(x = long, y = lat, group = group, fill = group, color = group))
However, I can't figure out how to fill the polygons by anything other than whatever group is. Here is my attempt:
concave.spdf$ocean <- c('P','A','A')
ggplot() +
  geom_polygon(data = concave.spdf, aes(x = long, y = lat, group = group, fill = ocean, color = ocean))
Which throws this error:
Error in FUN(X[[i]], ...) : object 'ocean' not found
I think the issue is that split groups the polygons by identity when passed to concaveman, but if I change that, they won't plot correctly (because the points of different polygons will be merged). How do I keep the polygons plotted individually but color them by a grouping variable? (If it's possible I'd prefer to stick with concaveman for aesthetic reasons in the true plot [which is much more complicated than this reprex] -- I know that if I use a different approach to plotting the polygons this would be easier.)
-
R terra remove raster cells touching spatvector lines
I'm trying to calculate shapes of touching raster cells within areas using terra. I haven't been able to reproduce my problem using sample data, so I hope someone might still be able to help without that (I tried, but somehow couldn't replicate the issue). Here goes: I have a raster layer of 200mX200m cells of the US. I want to get characteristics of the largest patch of cells within each State. My current issue is that some cells/patches seem to be popping up in multiple states after later identifying patches and turning them into a SpatVector. I am now trying to figure out if the issue arises early on, when cropping and masking OR whether it is an issue that comes when turning the patches into a SpatVector (though I think it's the former and my masking is not working properly).
If anybody could help me figure out how to crop out/mask the cells that are touching a State border that would be highly appreciated!!!
Here's my current stylized approach (again, I really tried to get an example going, sorry!):
library(terra)
library(sf)  # needed for st_read()
### get raster layer
layr <- rast(file1)
### get shapefile of three bordering regions to crop
state_shp <- st_read(file2)
state2 <- vect(state_shp$geometry)
### crop & keep only what's inside state polygons
## HERE NEED TO DROP ALL CELLS THAT ARE TOUCHING STATE BORDERS
layr_cropped <- crop(layr, state2, mask=T, snap = "in")
##also tried: mask(layr,state2 , touches = T)
### turn into patches & then spatvector
layr_patched <- patches(layr_cropped, 8)
testvector <- terra::as.polygons(layr_patched, trunc=T, dissolve = T)
-
How to do a double join: a spatial join AND an attribute join using sql query on DB manager of qgis?
I have a point layer and a polygon layer. The points represent the stops of a bus line. The polygons represent a 10 m buffer around the lines (each bus line is split every 100 m). I would like to count the points that intersect each polygon and sum their values. However, as several bus lines and several stops overlap, a basic location join is not sufficient, because I end up with the sum of the stops of all the overlapping lines. I guess I need to do a spatial AND attribute join using the common id between stops and bus line. Do you have an idea for a query to test in the QGIS DB Manager?
Thanks :), Wendy
-
ggplot x axis label cant be edited
library(sf)
library(ggplot2)  # needed for ggplot() and geom_sf()
library(maptools)
library(scales)
library(rnaturalearth)
library(rnaturalearthdata)
I have a grid data frame that I want to add to ggplot as a raster object
plot_df <- dput(plot_df[100:150,c(1,2,22)])
structure(list(
  longitude = c(-179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75,
    -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75,
    -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75,
    -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75,
    -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75, -179.75,
    -179.75),
  latitude = c(-40.25, -39.75, -39.25, -38.75, -38.25, -37.75, -37.25, -36.75, -36.25, -35.75,
    -35.25, -34.75, -34.25, -33.75, -33.25, -32.75, -32.25, -31.75, -31.25, -30.75,
    -30.25, -29.75, -29.25, -28.75, -28.25, -27.75, -27.25, -26.75, -26.25, -25.75,
    -25.25, -24.75, -24.25, -23.75, -23.25, -22.75, -22.25, -21.75, -21.25, -20.75,
    -20.25, -19.75, -19.25, -18.75, -18.25, -17.75, -17.25, -16.75, -16.25, -15.75,
    -15.25),
  chl = c(5.79614072001568, 5.88809180584808, 5.91470803081587, 6.02307441926183, 5.94829246190804,
    5.90227592841317, 5.88679405479741, 5.86218190882011, 5.88041651864321, 6.07368487605027,
    6.1523111308037, 6.08351738769765, 6.05085211903314, 6.1526656045802, 6.01818374516031,
    6.06720061938859, 6.10903571379908, 6.08132230310023, 5.94930399887433, 5.88474219391825,
    5.9661907387131, 5.94088644166972, 5.7699915104762, 5.75660811309503, 5.72855772442633,
    5.75380875490745, 5.74173250244927, 5.6893502273505, 5.65375136847291, 5.61335816690552,
    5.62649249254121, 5.56363852997544, 5.61024930490138, 5.5855319776649, 5.57533893757262,
    5.62878350382851, 5.67701322880346, 5.65414261630743, 5.65748798686245, 5.65999372947895,
    5.70239818840709, 5.71014558066082, 5.68826542087147, 5.65905910276594, 5.54868461636391,
    5.53320312262907, 5.55743825882603, 5.74248561230272, 5.76249373157709, 5.25381860438204,
    5.02448076170133)),
  row.names = c(NA, -51L), class = c("tbl_df", "tbl", "data.frame"))
I have a very nice layout using geom_sf() and world data from the naturalearth package. Specifically, the axis labels are formatted nicely. However, adding the raster object somehow ruins the x-axis labels:
world <- ne_countries(scale = "medium", returnclass = "sf")
# add continents
ggplot(data = world) +
  geom_sf(color = "black", fill = "grey40") +
  geom_raster(data=plot_df, aes(y= latitude, x = longitude, fill = chl)) +
  scale_fill_gradientn(colours = rev(rainbow(7)), na.value = NA) +
  scale_x_continuous(breaks = seq(-180, 180, by = 40))+
  theme_bw() +
  coord_sf(expand = FALSE) +
  labs(fill="log(chl)", x="Longitude", y="Latitude")
The x-axis coordinates only show 180s, and I can't find a way to resolve it
-
Adding names on map using tmap/tmaptools
I have made a map using the cited packages and code. However, I would like to add names or country codes on top of each country like in picture two. Is this possible?
library(sf)
library(tmap)
library(tmaptools)
library(leaflet)
library(dplyr)
library(tidyverse)
library(readxl)  # provides read_excel()
options(scipen = 999)
data <- read_excel("Datakart.xlsx")
EU <- st_read("NUTS_RG_20M_2021_3035.shp", stringsAsFactors = FALSE)
EU_and_data <- inner_join(EU, data)
euro <- c(2377294, 1313597, 7453440, 5628510)
tm_shape(EU_and_data, bbox = euro) +
  tm_polygons("2015-2019", id = "CNTR_ID", palette = "Blues", style = "cont")
Data:
structure(list(
  NUTS_ID = c("AT", "DK", "FI", "FR", "LU", "DE"),
  `2010-2014` = c(0.3635, 0.9272, 0.5081, 0.8051, 0.8051, 0.78),
  ...3 = c("***", "**", "***", "***", "***", "***"),
  `2015-2019` = c(0.6019, 0.8196, 0.2917, 0.757, 0.757, 0.849),
  ...5 = c("***", "***", "*", "***", "***", "***"),
  STUSPS = c("NY", "NJ", "PA", "FL", "GA", "NC"),
  `2010 - 2014` = c(0.2941, 0.3832, 0.2895, 0.2835, 0.3128, 0.2557),
  ...8 = c("***", "***", "***", "***", "***", "***"),
  `2015 - 2019` = c(0.1821, 0.2473, 0.24, 0.313, 0.4137, 0.366),
  ...10 = c("***", "***", "***", "***", "***", "***")),
  row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
-
Trying to export sf dataframe to .csv file in R with st_write
I have a sf dataframe which has geometry column looking like this:
geometry
list(list(c(53252.151421, 421516.3252...
list(list(c(53252.151421, 421516.3252...
list(list(c(53252.151421, 421516.3252...
and I'm trying to export this whole df to .csv using this command:
st_write(df, "name.csv", layer_options = "GEOMETRY=AS_WKT")
However, after running the command I get this csv file.
this is the exported .csv after running the command
how can I fix this?
-
saving R leaflet map as html: Raster not shown in Firefox/ Safari but in Chrome
Unfortunately, I struggle to show a reproducible example here. The issue occurs only, when I add a locally saved raster file (140mb) to a leaflet map. And I don't know how to share such a large file here. The map shows fine in the rStudio viewer and in Chrome. But it fails to show the raster in Firefox and Safari (Firefox on OSX and iOS). Even auto zoom and pan does not work.
When I add a system raster file, everything works as expected:
library(leaflet)
library(leafem)
library(raster)
library(stars)
#to access the stars system.file
tif = system.file("tif/L7_ETMs.tif", package = "stars")
x2 <- raster(tif)
map_raster <- leaflet() %>%
  addTiles() %>%
  leafem:::addGeoRaster(
    x2,
    opacity = 1,
    colorOptions = colorOptions(
      palette = grey.colors(256)
    )
  )
library(htmlwidgets)
saveWidget(map_raster, file = "TestExport_raster.html", selfcontained = T)
But when I replace x2 with the local raster, it does not work. I tried adding an object from the raster and the stars package, but with no difference.
Any help on that? Is the raster just too big for a leaflet map?
The question might be related to this one: saving R leaflet map as html: tiles not included
Best, Beni
-
Setting colors by factor in country map in R (raster package)
I am trying to assign each region of Kenya a color according to the vector with colors I made. I have already made a vector with colors as elements and region names as names. I think I already have everything I need, but I simply don't know how to assign those colors to the right region. I only used data from the raster package and a numerical vector called:
This is the data I'd like to use to color the map
library(raster)
library(ggplot2)  # needed for ggplot() and fortify()
library("viridis")
Kenya = getData("GADM", country = "Kenya", level = 1)
Kenya@data
Kpoly <- Kenya@polygons
ggplot(fortify(Kenya), aes(long, lat, group = group, fill = id)) +
  geom_polygon() +
  theme_void() +
  coord_equal() +
  scale_fill_manual("Kenya - Z score by region", values = colors_map, labels = colfac2)
I already tried to play with the parameters, but I can't quite get the result I want.
-
multinomial logit
I'm stuck with running a multinomial logit regression in R. The data preview is attached for the reference. How should I run it? I'm new to R, and need to do this for applied econometrics using R. Can you help me with reshaping data and running multinomial regression?
> head(data)
  marketindex x1_prod1 x2_prod1 x3_prod1 x1_prod2 x2_prod2 x3_prod2 x1_prod3 x2_prod3 x3_prod3 x1_prod0 x2_prod0 x3_prod0 choice
1           1 7.459917        1 7.267866  6.67054        1 7.633743 8.444682        0 11.30016        0        0        0      3
2           1 7.459917        1 7.267866  6.67054        1 7.633743 8.444682        0 11.30016        0        0        0      2
3           1 7.459917        1 7.267866  6.67054        1 7.633743 8.444682        0 11.30016        0        0        0      3
4           1 7.459917        1 7.267866  6.67054        1 7.633743 8.444682        0 11.30016        0        0        0      2
5           1 7.459917        1 7.267866  6.67054        1 7.633743 8.444682        0 11.30016        0        0        0      2
6           1 7.459917        1 7.267866  6.67054        1 7.633743 8.444682        0 11.30016        0        0        0      2
-
Data Analysis using language R