Shapefile and location coordinates don't overlap each other
I am trying to overlay a map of a country with markers at specific locations. To do this, I first downloaded the boundary of the country and then the lat/lon values of the points I want to plot:
library(rgeoboundaries)
boundary <- geoboundaries("Mongolia")
library(MODISTools)
library(dplyr)
points <- mt_sites() %>%
filter(country == "Mongolia")
Then I tried to plot them together using ggplot2, but they don't overlay each other:
library(ggplot2)
ggplot() +
geom_sf(data = boundary) +
geom_point(data = points,
aes(x = latitude,
y = longitude))
1 answer
-
answered 2021-04-08 07:04
Bappa Das
Your `points` is a `data.frame`, not a `SpatialPointsDataFrame`. So, first I have converted the `data.frame` to a `SpatialPointsDataFrame` using `coordinates(points) <- ~longitude + latitude`. Then I have assigned it a `crs` (`+proj=longlat +datum=WGS84 +no_defs`). Then I have converted the `sp` object to an `sf` object using `st_as_sf`, and then I have plotted it like this:
library(rgeoboundaries)
library(raster)
library(tidyverse)
library(sf)
library(MODISTools)
boundary <- geoboundaries("Mongolia")
points <- mt_sites() %>%
  filter(country == "Mongolia")
#Check the coordinate reference of points and boundary
crs(points)
#> [1] NA
crs(boundary)
#> CRS arguments: +proj=longlat +datum=WGS84 +no_defs
#See the class of the points
class(points)
#> [1] "data.frame"
#Convert the data.frame to SpatialPointsDataFrame
coordinates(points) <- ~longitude + latitude
#Assign the crs to points SpatialPointsDataFrame
crs(points) <- crs(boundary)
#Convert sp object to sf object
points_sf <- st_as_sf(points)
#Plot the results
ggplot() +
  geom_sf(data = boundary) +
  geom_sf(data = points_sf)
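A shorter route is possible if only the sf package is used: `st_as_sf()` can build the point layer straight from the data frame, so the sp step can be skipped. This is a minimal sketch, not a drop-in replacement. The two-row data frame is a made-up stand-in for the filtered `mt_sites()` result (only the column names `longitude` and `latitude` come from the question; the site names and coordinates are invented), and EPSG:4326 is assumed to match the boundary's longlat/WGS84 CRS.

```r
library(sf)
library(ggplot2)

# Hypothetical stand-in for the filtered mt_sites() data frame;
# the site names and coordinates are invented for illustration.
points <- data.frame(
  siteid    = c("site_a", "site_b"),
  longitude = c(106.9, 100.2),
  latitude  = c(47.9, 46.5)
)

# Build an sf point layer directly; coords is given as (x, y) = (longitude, latitude).
# crs = 4326 is an assumption: plain longitude/latitude on WGS84.
points_sf <- st_as_sf(points, coords = c("longitude", "latitude"), crs = 4326)

# Add geom_sf(data = boundary) before the points layer to draw the country outline.
p <- ggplot() +
  geom_sf(data = points_sf)
```

Note also that the `geom_point()` call in the question maps `x` to latitude and `y` to longitude. Swapping them to `aes(x = longitude, y = latitude)` should already put the points on top of the boundary, since both layers are then in the same longitude/latitude coordinates.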
See also questions close to this topic
-
Access Twitter Fact Check Algorithm with `rtweet`
I'm trying to access the Twitter API using the `R` package `rtweet`. Using the command `rtweet::get_timeline("@me")`, I'm able to pull the latest 100 tweets from the account @me. I'm trying to see if these tweets have a warning label on them, and if so what was the probability of them being labeled. I read twitter uses an algorithm to flag tweets with potential misinformation then sends them to human review. Is this information able to be seen? Thanks!
-
Using pivot_wider or similar function with R with repeat measurement data
I have a dataframe of patients in the format of one line per chest x-ray. My columns include a specific measurement on the chest x-ray, the date of the chest x-ray, and then several additional columns that are the same for a given patient (like final outcome).
For example:
+--------+------------+----------+------------+-------------+-----+-------+---------+
| pat_id | index_date | cxr_date | delta_date | cxr_measure | age | admit | outcome |
+--------+------------+----------+------------+-------------+-----+-------+---------+
| 1      | 1/2/2020   | 1/2/2020 | 0          | 0.1         | 55  | 1     | 0       |
| 1      | 1/2/2020   | 1/3/2020 | 1          | 0.3         | 55  | 1     | 0       |
| 1      | 1/2/2020   | 1/3/2020 | 1          | 0.5         | 55  | 1     | 0       |
| 2      | 2/1/2020   | 2/2/2020 | 1          | 0.2         | 59  | 0     | 0       |
| 2      | 2/1/2020   | 2/3/2020 | 2          | 0.9         | 59  | 0     | 0       |
| 3      | 1/6/2020   | 1/6/2020 | 0          | 0.7         | 66  | 1     | 1       |
+--------+------------+----------+------------+-------------+-----+-------+---------+
I want to reformat the table so it is one line per patient. My end table I think should look something like the below, where each variable is turned into `cxr_measure_#`, where `#` is the `delta_date`. In the real dataset, I'll have many of these columns (the # would range from -5 to +30). If there are two rows/values on the same delta_date, ideally I would want to take the mean.
+--------+------------+----------------+---------------+---------------+---------------+-----+-------+---------+
| pat_id | index_date | first_cxr_date | cxr_measure_0 | cxr_measure_1 | cxr_measure_2 | age | admit | outcome |
+--------+------------+----------------+---------------+---------------+---------------+-----+-------+---------+
| 1      | 1/2/2020   | 1/2/2020       | 0.1           | 0.4           | NA            | 55  | 1     | 0       |
| 2      | 2/1/2020   | 2/2/2020       | NA            | 0.2           | 0.9           | 59  | 0     | 0       |
| 3      | 1/6/2020   | 1/6/2020       | 0.7           | NA            | NA            | 66  | 1     | 1       |
+--------+------------+----------------+---------------+---------------+---------------+-----+-------+---------+
Is there an easy way to basically reshape between these two tables? I've played a little bit with pivot_longer and pivot_wider, but wasn't sure how to (1) deal with getting the delta_date in the variable name and (2) how to take the mean if there are two overlapping dates. Also curious if this is easier accomplished in python (did most of the data curation using pandas, but then did some additional data cleaning and analysis in R).
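One possible approach in R, sketched under the assumption that a recent tidyr (1.1.0 or later) is available: `pivot_wider()` can take the column suffix from `delta_date` through `names_from` and `names_prefix`, and can average duplicates through `values_fn = mean`. The data frame below retypes the example rows from the question; `first_cxr_date` is computed as the earliest `cxr_date` per patient, and the dates are assumed to be in month/day/year order.

```r
library(dplyr)
library(tidyr)

# Example rows retyped from the question
cxr <- data.frame(
  pat_id      = c(1, 1, 1, 2, 2, 3),
  index_date  = c("1/2/2020", "1/2/2020", "1/2/2020", "2/1/2020", "2/1/2020", "1/6/2020"),
  cxr_date    = c("1/2/2020", "1/3/2020", "1/3/2020", "2/2/2020", "2/3/2020", "1/6/2020"),
  delta_date  = c(0, 1, 1, 1, 2, 0),
  cxr_measure = c(0.1, 0.3, 0.5, 0.2, 0.9, 0.7),
  age         = c(55, 55, 55, 59, 59, 66),
  admit       = c(1, 1, 1, 0, 0, 1),
  outcome     = c(0, 0, 0, 0, 0, 1)
)

wide <- cxr %>%
  # Assumption: dates are month/day/year
  mutate(cxr_date = as.Date(cxr_date, format = "%m/%d/%Y")) %>%
  group_by(pat_id) %>%
  mutate(first_cxr_date = min(cxr_date)) %>%
  ungroup() %>%
  # cxr_date varies within a patient, so drop it before widening
  select(-cxr_date) %>%
  pivot_wider(
    names_from   = delta_date,      # (1) delta_date becomes the column suffix
    names_prefix = "cxr_measure_",
    names_sort   = TRUE,
    values_from  = cxr_measure,
    values_fn    = mean             # (2) average rows sharing a delta_date
  )
```

With negative offsets the generated names (for example `cxr_measure_-5`) are not syntactic R names, so they would need backticks or a rename afterwards.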
-
Create a new column based on one specific cell of data
I am posting to request help on formatting a list of excel sheets, creating a new column based on one particular cell of data.
My DF looks similar to the following:
1               2               3         4            5         6            7           8
NA              NA              NA        NA           NA        NA           NA          NA
NA              NA              Oct 2020  NA           NA        NA           NA          NA
NA              NA              Total     NA           Consumer  NA           Commercial  NA
Spending State  Metro Area      Sales     Transaction  Sales     Transaction  Sales       Transaction
AK              Anchorage, AK   9000      120          2000      60           7000        60
AL              Montgomery, AL  8000      130          2000      30           6000        1000
I have a list of files which import similarly to this. So far I have processed them as such to form a list:
#Copying files to R working directory
#OneDrive location (source)
DF_Onedrive <- "C:/Users/-----"
#R Project (working directory)
DF <- "C:/Users/-----"
#List of files to be copied
list_of_DF <- list.files(DF_Onedrive, "*.xls")
#Copying over to WD
file.copy(file.path(DF_Onedrive, list_of_DF), DF)
#Reading data from R Project inputs
data_DF <- list.files(path = "C:/Users/-----", pattern = '*.xls', full.names = TRUE)
I now want to compile the list together as one file. The source files are quarterly, but have tabs separated by months in the quarter such as for month 1, month 2, month 3.
The approach I was going for was similar to:
for (file in data_DF) {
  # read in the xls and clean
  M1_MSA <- read_excel(file, sheet = 10)
}
Where M1 represents month 1 and pulls from sheet 10-- then I would run a subsequent loop for M2 in sheet 11, and M3 in sheet 12. I would have a single output file for each month, which I would later append together.
My question is on the cleaning I would need to do during this loop: particularly, for each file I need to place the Date (here Oct 2020) in a column which repeats that value for the sheet being read in, looped for each sheet. I need something similar for Total, Consumer, and Commercial in a new "Segment" column, which merges the 'Sales' columns and 'Transactions' columns.
The data in the end should look like:
Date      Spending State  Metro Area     Segment     Sales  Transaction
Oct 2020  AK              Anchorage, AK  Total       9000   120
Oct 2020  AK              Anchorage, AK  Consumer    2000   60
Oct 2020  AK              Anchorage, AK  Commercial  7000   60
-
Grouping values for facet_grid in R
I have the following dataset:
# read.table() only accepts a one-character separator, so split on "|" and trim the padding
df = read.table(sep="|", header=T, strip.white=T, text="combination | priority | boolean | value
0 | 1 | True | 1.4
0 | 2 | True | 2.0
0 | 3 | False | 2.1
1 | 1 | True | 3.2
1 | 2 | True | 54.2
1 | 3 | True | 2.2
2 | 1 | False | 12.1
2 | 2 | False | 44.1
2 | 3 | True | 6.18
3 | 1 | True | 8.76
3 | 2 | False | 23.45
3 | 3 | False | 11.33
4 | 1 | False | 2.23
4 | 2 | False | 98.78
4 | 3 | False | 1.55
")
To plot the data in R, I use the following code:
mapping <- aes(
  x = ???,
  y = value,
)
plot <- (ggplot(data=df, mapping=mapping)
  + geom_line() #line by priority
  + geom_point()
  + facet_grid(boolean ~ comb_group, scales = "free_x")
)
The problem is I can't figure out how to create the `comb_group` parameter that I would pass to `facet_grid()`. I have the parameter `combination` in my data set, and I would like to show the values for each combination AND 0. So there should not be a separate facet for combination 0; instead, the values from that one should also be added to each other group. That way, when it plots a facet for combination 1 it will also contain the values for 0, the facet for 2 will have the values from 2 and 0, etc.
To explain why I have both geom_line and geom_point: my idea is to have a point for each value that is plotted, and these points should be connected by priority with a line.
-
How can I plot Latitude_min, Latitude_max, Longitude_min, and Longitude_max to grid/Quadrilateral form using ggplot2?
I want to draw sampling locations on a world map. Please can you help me to plot Latitude_min, Latitude_max, Longitude_min, and Longitude_max in grid/Quadrilateral/square shape using ggplot2 (in R)?
My data format looks like--
Continent  Country  Region     Latitude_min  Latitude_max  Longitude_min  Longitude_max  Elevation (m)
Asia       China    China      -25.0         45.0          -70            125            1830
Asia       China    China      -25.0         45.0          -105           125            2150
Asia       China    China      -25.0         45.0          -70            105            1370
Asia       Asia     Asia       -2.5          57.5          -62.5          142.5          3810
Asia       China    East Asia  -10           55            60             -150           3499
thank you
-
Facet_wrap labels as panel labels in ggplot
I'm working on a gene expression profile plot where I've faceted the profiles by cluster number. I've been able to get the plot pretty close to what I'm looking for, but I would like to be able to make this with the panel labels in the upper right-hand corner of each respective plotting area.
Here are my data:
structure(list(time_point = c("10", "10", "10", "10", "10", "10", "10", "10", "10", "13", "13", "13", "13", "13", "13", "13", "13", "13", "24", "24", "24", "24", "24", "24", "24", "24", "24", "35", "35", "35", "35", "35", "35", "35", "35", "35"), cluster = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "1", "2", "3", "4", "5", "6", "7", "8", "9", "1", "2", "3", "4", "5", "6", "7", "8", "9", "1", "2", "3", "4", "5", "6", "7", "8", "9"), mean_value = c(-0.426083520451465, 0.604608926627412, 0.76451364891251, -0.471139779941961, 0.2351694555588, -0.705679926146899, -0.454858199321039, -0.165115160845773, 0.724407409818679, 0.781438905910975, 0.146965540891108, -0.769313576080577, -0.364131220020667, -0.977858147148868, 0.723255456138281, 1.27112335113834, -0.32602591779613, -0.411066723030554, -0.229664052498654, -0.854136058469294, -0.66018500031545, 1.2563451985156, 0.544812333405859, 0.693110024751582, -0.350948607542015, -0.424710073864252, -0.160578456515796, -0.125691332960858, 0.102561590950772, 0.664984927483517, -0.421074198552966, 0.197876358184206, -0.710685554742966, -0.465316544275294, 0.915851152506157, -0.15276223027233), sd = c(0.0492840366570553, 0.51236309986752, 0.0969492719939365, 0.0586339372406152, 0.0929657760623659, 0.0473263114084479, 0.0276800874550852, 0.0735227878362672, 1.0416084268533, 1.1513324603848, 0.193252849305954, 0.0172156507400275, 0.0327622261605831, 0.0538099322569242, 0.109168493202137, 0.481243962369782, 0.0826879496732078, 0.0388480633488431, 0.159249818818607, 0.0417074333259725, 0.042500837893243, 0.0527404368558342, 0.0688903901671137, 0.0593712415929212, 0.0152356442432683, 0.0577220450239637, 0.0930169866292114, 0.204810043584311, 0.177841591444567, 0.236871574361693, 0.0369824877579054, 0.210415560702688, 0.067153094681514, 0.043764627685752, 0.429144691634867, 0.0203773261748577)), row.names = c(NA, -36L), groups = structure(list(time_point = c("10", "13", "24", "35"), .rows = structure(list(1:9, 10:18, 19:27, 28:36), 
ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))), row.names = c(NA, 4L), class = c("tbl_df", "tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", "tbl_df", "tbl", "data.frame"))
And my current ggplot code:
# Adjustments to ggplot theme
Alex_Theme = theme_bw() +
  theme(plot.title = element_text(hjust = 0.5, face='plain', size = 12)) +
  theme(plot.title = element_text(vjust=0)) +
  theme(plot.subtitle=element_text(size=10, hjust=0.5, face="italic", color="black")) +
  #theme(legend.position= "none") +
  theme(panel.border = element_rect(fill=NA, colour = "black", size=0.5)) +
  theme(axis.text = element_text(face = "plain", size = 12)) +
  theme(axis.title.x = element_text(margin = margin(t = 6, r = 20, b = 0, l = 0))) +
  theme(axis.title.y = element_text(margin = margin(t = 0, r = 6, b = 0, l = 0))) +
  #theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  theme(axis.title = element_text(face="plain", size = 12))
# colorblind pallet :)
cbPalette <- c("#000000", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7", "red")
# named vector of labels for facet
facet_labels <- c("A.", "B.", "C.", "D.", "E.", "F.", "G.", "H.", "I.")
names(facet_labels) <- c("1", "2", "3", "4", "5", "6", "7", "8", "9")
# plotting
ggplot(centroids_long_summary, aes(x=time_point,y=mean_value, group=cluster, colour=as.factor(cluster))) +
  Alex_Theme +
  # Facet_Theme +
  geom_line() +
  geom_point(size = 1) +
  geom_errorbar(aes(ymin= mean_value - sd, ymax= mean_value + sd), width=0.1) +
  ylab("Gene expression") +
  xlab("Developmental stage") +
  scale_x_discrete(labels = c("Pre", "Dia-C", "Quies", "Post")) +
  scale_color_manual(name = "Cluster", values = cbPalette) +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  theme(legend.position= "none") +
  facet_wrap(.~cluster, ncol = 3, labeller = labeller(cluster = facet_labels) ) +
  theme(strip.background = element_rect(fill = "white", color = "white"),
        strip.text.x = element_text(hjust = 0, size = 12))
However, I'd like to try and get the panel labels inside the plotting areas (preferably without the space that a blank facet label would leave between each row).
-
Increase the BBOX size but keep the ratio (Lon/Lat)
I have a Bbox that is defined by the following values:
xmin: 11.555333537980914 ymin: 47.76067947037518 xmax: 11.995692579075694 ymax: 48.281587762758136
I would like to increase the size of this Bbox but keep the ratio. One approach I tried is to calculate the middle point of the Bbox and calculate a new Bbox with the value of radius increased by 50%. The problem: the ratio gets lost.
How could I increase the size of the Bbox by 50% but keep the ratio?
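One way to read the requirement is to scale the width and the height by the same factor around the centre point, which keeps the width/height ratio in degrees unchanged. The sketch below assumes that reading. `expand_bbox()` is a hypothetical helper name, and a factor of 1.5 is assumed to be what "increase by 50%" means here.

```r
# Scale a bounding box around its centre by the same factor in x and y.
# bbox is a named numeric vector with xmin, ymin, xmax, ymax.
expand_bbox <- function(bbox, factor = 1.5) {
  cx <- (bbox[["xmin"]] + bbox[["xmax"]]) / 2
  cy <- (bbox[["ymin"]] + bbox[["ymax"]]) / 2
  half_w <- (bbox[["xmax"]] - bbox[["xmin"]]) / 2 * factor
  half_h <- (bbox[["ymax"]] - bbox[["ymin"]]) / 2 * factor
  c(xmin = cx - half_w, ymin = cy - half_h,
    xmax = cx + half_w, ymax = cy + half_h)
}

# Values from the question
bbox <- c(xmin = 11.555333537980914, ymin = 47.76067947037518,
          xmax = 11.995692579075694, ymax = 48.281587762758136)
bigger <- expand_bbox(bbox, factor = 1.5)
```

The ratio is preserved in degrees only. A degree of longitude covers less ground than a degree of latitude at this latitude, so if the ratio should hold in metres, the box would have to be transformed to a projected CRS before scaling.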
-
For a global dataset of geographic points (in lat/long) how to find nearest neighbors accounting for the spherical nature of our planet
So we've got this working without accounting for the projection issue. The issue is where (and how) to best add the re-projection so that the function returns the value in km rather than the current degrees:
library(raster)
library(purrr)
library(sf)
#example presence data from model
r1 <- raster(nrow=360, ncol=720)
crs(r1) <- "+proj=longlat +datum=WGS84 +no_defs"
values(r1) <- rbinom(ncell(r1), 2, 0.01)
r1_points <- rasterToPoints(r1)
r1_df <- data.frame(r1_points)
r1_presence <- r1_df %>% dplyr::filter(layer==1)
#example survey data
survey_points <- cbind(rnorm(50) * 5 + 10, rnorm(50) + 50)
pt2 <- st_multipoint(cbind(survey_points[,1], survey_points[,2]))
#distance between each modelled presence (pt1) and survey point (pt2)
get_distances <- function(i, pt2, df) {
  pt1 <- st_multipoint(cbind(df[i, 1], df[i, 2]), dim = "XY")
  a <- st_nearest_points(pt1, pt2)
  return(st_length(a))
}
#loop for all modelled presences
output <- map_dbl(1:nrow(r1_presence), get_distances, pt2, r1_presence)
Ideally a perfect answer would expand the `get_distances` function to add a new option that does the appropriate re-projection and returns the value in km.
There may be a few different approaches here and I'm curious what people will come up with.
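One option that avoids re-projection altogether is to compute great-circle distances directly from longitude/latitude with the haversine formula. This swaps the technique rather than extending the sf-based function. The sketch assumes a spherical Earth with a mean radius of 6371 km, so results differ slightly from ellipsoidal distances. `haversine_km()` and `get_distances_km()` are hypothetical names, and the tiny `presence` and `survey` objects are made-up stand-ins for `r1_presence` and `survey_points`.

```r
# Great-circle distance in km (haversine formula), vectorised over the second point.
# Assumes a spherical Earth with mean radius 6371 km.
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) just above 1
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Distance from row i of df (columns 1 and 2 = lon, lat) to its nearest survey point.
# survey is a two-column matrix of lon, lat.
get_distances_km <- function(i, survey, df) {
  min(haversine_km(df[i, 1], df[i, 2], survey[, 1], survey[, 2]))
}

# Tiny made-up example: each presence point sits one degree of latitude from a survey point
presence <- data.frame(x = c(0, 10), y = c(0, 50))
survey <- cbind(c(0, 10), c(1, 51))
output <- vapply(seq_len(nrow(presence)), get_distances_km, numeric(1), survey, presence)
```

The argument order mirrors the original `get_distances`, so the same `map_dbl(1:nrow(...), ...)` call pattern from the question should work with `survey_points` and `r1_presence` in place of the toy data.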
-
geographical distance from the geometric centroid of the region in R
How to find the distance from the geometric centroid of a region to the centroid of all regions which have boundary with this region in R?
-
How to add isolated raster values to surrounding class?
Given is an initial raster:
r <- raster(ncol=10, nrow=10)
values(r) <- c(1,1,2,1,1,1,1,1,3,3,
               1,1,1,1,1,1,1,3,3,3,
               1,1,1,1,1,3,3,3,3,3,
               1,1,1,2,2,3,3,3,3,3,
               1,2,2,2,2,3,3,2,3,3,
               3,2,2,2,2,3,3,3,3,1,
               2,2,2,2,2,2,3,2,3,1,
               2,2,2,3,3,3,3,3,3,3,
               2,3,3,3,3,3,3,3,3,3,
               3,3,3,3,3,3,3,3,3,1)
plot(r)
There are several "islands" within clumps. What I'm looking for is a way to add these isolated pixels to the dominant surrounding class, so that I get the following raster:
r1 <- raster(ncol=10, nrow=10)
values(r1) <- c(1,1,1,1,1,1,1,1,3,3,
                1,1,1,1,1,1,1,3,3,3,
                1,1,1,1,1,3,3,3,3,3,
                1,1,1,2,2,3,3,3,3,3,
                1,2,2,2,2,3,3,3,3,3,
                2,2,2,2,2,3,3,3,3,3,
                2,2,2,2,2,2,3,3,3,3,
                2,2,2,3,3,3,3,3,3,3,
                2,3,3,3,3,3,3,3,3,3,
                3,3,3,3,3,3,3,3,3,3)
plot(r1)
Isolated pixels were added to the dominant surrounding class, but a pixel was left unchanged if at least one neighbouring cell had the same value.
How can I achieve this result?
-
R Shiny - filtering data frame with drop down menu before joining with spatial data
Hi! I want to dynamically change the way I filter a dataset before I merge it to a GeoJson file. Is there any way I could do this?
Pop2 is the spatial data
PostOfficeCountyfil is the filtered data frame
PostOfficeCounty is the data frame I want to filter
server <- function(input, output) {
  Pop2 <- reactive({
    PostOfficeCountyfil <- reactive({
      PostOfficeCounty %>% filter(YYYYMM == input$month)
    })
    Pop2@data <- Pop2@data %>% mutate(FIPS = paste0(STATE, COUNTY))
    Pop2@data <- left_join(Pop2@data, PostOfficeCountyfil, by = c("FIPS"="county"))
  })
  output$leaflet1 <- renderLeaflet({
    Pop2 %>%
      leaflet() %>%
      #addTiles()%>%
      addPolygons(fillColor = ~colors1(totalPct),
                  fillOpacity = .7,
                  color = "black",
                  opacity = 1,
                  label = ~NAME.y,
                  popup = ~`2013 code`,
                  weight = 1) %>%
      setView(-96, 37.8, 3) %>%
      addLegend(pal = colors1, values = Pop2@data$Pctchange)
  })
}
-
How to calculate the number of citizens who have access to each cell of a grid, using R? [accessibility of a grid cell]
I have the following problem:
Imagine that I have a georeferenced vector hex grid (in .gpkg format), where there is a column "built_pop_sum", containing information about how many citizens live within a particular hex of the grid. I want to add another field to this .gpkg, for instance, "access_pop_sum", where for each hex I want to have a total number of citizens, who can reach this hex. The accessibility is measured simply in hexes; the distance of normal accessibility is, let's say, 10 hexes. How can I calculate such variable? I would also appreciate, if there is a tool to calculate this variable by measuring distance in kilometers.
-
How can I generate a numpy array from a Shapefile in such a way that it has the same dimensions as a Numpy Array obtained from a Raster band?
I am really new at this Flopy + Modflow 6 thing so this is my problem,
First I defined the discretization of a grid from the pixels of a topography raster (50 x 50) by using GetRasterBand(1).ReadAsArray() tool. So I used that array for ztop and zbot, and the Raster Size to define row and column grid size. The problem is that now I have to find the values inside that npArray, to define my rivers and boundaries of my system, which I have in Shapefiles.
When I import these Shapefiles and use np.array(Shapefile.shapeRecords()[0].shape.points) to create a Numpy Array from it, the dimensions created are not the same as the dimensions of the raster, even though I use a Shape of the same extent.
Is there a code to equal both arrays in order to find my boundary coordinates and river coordinates inside the npArray created from the raster?
Any comment will help me, and if you think there is another way to include rasters and shapefiles to define the input packages, please let me know.
-
TypeError: 'LineString' object is not iterable
I create a dictionary with shapes from a shapefile like this:
sfWholeStreets = shapefile.Reader(inputFilename)
shapesWholeStreets = sfWholeStreets.shapes()
recordsWholeStreets = sfWholeStreets.records()
recordIndex = 0
for record in recordsWholeStreets:
    streetName = record[1]
    featureWholeStreet = sfWholeStreets.shapeRecords()[recordIndex].shape.__geo_interface__
    if hasattr(featureWholeStreet, "__geo_interface__"):
        ob = featureWholeStreet.__geo_interface__
    else:
        ob = featureWholeStreet
    linestringShapeWholeStreets = LineString(ob["coordinates"])
    if streetName in streetDictionary:
        streetDictionary[streetName].append(record)
        streetShapeDictionaryWholeStreets[streetName].append(linestringShapeWholeStreets)
    else:
        streetDictionary[streetName] = [record]
        streetShapeDictionaryWholeStreets[streetName] = [linestringShapeWholeStreets]
    recordIndex = recordIndex + 1
Then when I try to save the shape to a new shapefile like below, I get the TypeError at w.line():
for record in recordsWholeStreets:
    w.line(streetShapeDictionaryWholeStreets[record['Name']])
-
Highlight shapefile on a ggmap
I created a map using ggmap. Here is my code:
ggmap(get_googlemap(center = c(lon = 8.3, lat = 46.5),
                    zoom = 7, scale = 2,
                    maptype ='satellite',
                    size = c(640,640))) +
  geom_polygon(aes(x = long, y = lat, group=id),
               data = shape_switzerland,
               color ="white", fill ="orangered4",
               alpha = .2, size = 0.4)
Note: `shape_switzerland` is a `.shp` file which I downloaded from https://www.eea.europa.eu/data-and-maps/data/eea-reference-grids-2/gis-files/switzerland-shapefile (I took the 10km-version). I saved all five files ("ch_10km.dbf", "ch_10km.png", "ch_10km.prj", "ch_10km.shp", "ch_10km.shx") in my directory and loaded the shapefile into R:
shape_switzerland <- readShapeSpatial("ch_10km.shp")
So my code above should fill the shape of Switzerland in a light orange color but instead, I get the following warning messages:
Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf
How can I avoid these warnings so that the shape is drawn on my map? I have tried other sources of shapefiles, but the same warnings occur and the shape is not drawn either.