Lets load the data saved in
03_GetData_Subset_and_Combine.Rmd and examine how
connectivity is represented in the data.frame.
E<-readRDS('mayJuneEastCoast_connectivity.RDS')
You should find a data.frame E of many observations and
10 variables. Each row contains four columns which contain a single
number: nxFrom and nyFrom which define the
grid location of a single particle release location, and
lonFrom and latFrom, which give the longitude
and latitude of that release location. The next five columns contain
lists of numbers which define where particles that were released at
(nxFrom,nyFrom) and (lonFrom,latFrom) ended
up. (nxTo,nyTo) are the grid cells the particles get to,
and (lonTo,latTo) are the latitudes they get to. Now, many
particles can get to the same grid-cell. The number of particles that
get to each (nxTo,nyTo) and (lonTo,latTo)
location are stored in numTo. The final column is
numLaunched, which is the total number of particles
released from a given (nxTo,nyTo) and
(lonTo,latTo) starting location. The sum of all the
numTo in a single row should always be equal to or less
than numLaunched. The sum of numTo will be
less than numLaunched if, when computing the subset of the
connectivity data, particles that settle outside of the starting points
are discarded (trimTo=FALSE in
subsetConnectivity_*()).
Before starting to plot the connectivity data, it is useful to load
the rnaturalfeatures data that provides coastline data,
along with the simple features library sf which provides
functions to define geometric features and the ggplot2
plotting library.
library(sf)
## Linking to GEOS 3.11.0, GDAL 3.5.3, PROJ 9.1.0; sf_use_s2() is TRUE
library(ggplot2)
library("rnaturalearth")
library("rnaturalearthdata")
##
## Attaching package: 'rnaturalearthdata'
## The following object is masked from 'package:rnaturalearth':
##
## countries110
world <- ne_countries(scale = "medium", returnclass = "sf") #get coastline data
class(world)
## [1] "sf" "data.frame"
To plot all of the starting locations, plot all the (lonTo,latTo)
points. In the plot below, we zoom in on the South Atlantic Bight by
changing the limits in coord_sf to better see the
individual points.
p<-ggplot(data = world) + geom_sf() +
coord_sf(xlim= c(-82, -70), ylim = c(24, 36), expand = FALSE)+
geom_point(data = E, aes(x = lonFrom, y = latFrom), size = 1, shape = 23, fill = "green",color='black')
print(p) #this makes the figure appear
Now lets look at where all the particles released in a small region go in the 18 day drifting duration of this data. First, define a polygon that encompasses the region where the particles are released:
#define limits of a box that defines the release locations
lonLimits=c(-82.0,-80) #longitude of corners
latLimits=c(28.0,30.0) #latitude of corners
limitPoly=st_polygon(list(cbind(lonLimits[c(1,2,2,1,1)],latLimits[c(1,1,2,2,1)])))
limitPoly<-st_sfc(limitPoly,crs=4326) #make into a surface, 4326 is WGS84
Now further subset E to only include particle starting location. Set
trimTo to FALSE so we also include particles
that leave the starting locations. Because
subsetConnectivity_byPolygon() comes from the
connectivityUtilities.R, we must source
connectivityUtilities.R to load the functions in it.
#subset connectivity data
source('connectivityUtilities.R')
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ lubridate 1.9.2 ✔ tibble 3.2.1
## ✔ purrr 1.0.1 ✔ tidyr 1.3.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
##
## Attaching package: 'foreach'
##
##
## The following objects are masked from 'package:purrr':
##
## accumulate, when
##
##
## Loading required package: parallel
## [1] "file connectivityData/EZfateFiles/model_depth_and_distance.nc exists, no download"
EstartPoints<-subsetConnectivity_byPolygon(E,limitPoly,trimTo=FALSE)
Now combine each row in EstartPoints and make a
data.frame with each location particles get to, and the total number of
particles that get there.
allTo<-data.frame(lonTo=c(EstartPoints$lonTo,recursive=TRUE),
latTo=c(EstartPoints$latTo,recursive=TRUE),
numTo=c(EstartPoints$numTo,recursive=TRUE))
and them add together the total number of points that get to the same
(lonTo,latTo) points
whereAllWent<-allTo %>% group_by(lonTo,latTo) %>% summarise(numTo=sum(numTo))
numTo in whereAllWent is the number of
points that got to each destination location. It is often better to know
the fraction of points launched that go to each point.
sum(EstartPoints$numLaunched) gives the total number of
particles launched, and so we can use it to calculate the fraction of
particles that go to each point:
whereAllWent$fracTo<-whereAllWent$numTo/sum(EstartPoints$numLaunched)
and now we can plot the starting points from
(EstartPoints$lonFrom,EstartPoints$latFrom) in bright green
and the log10(density) of where particles released from
those points ended up with log10(whereAllWent$fracTo).
:
p<-ggplot(data = world) + geom_sf() +
coord_sf(xlim= c(-82, -55), ylim = c(24, 42), expand = FALSE)
#p<-p+geom_point(data = EstartPoints, aes(x = lonFrom, y = latFrom), size = 1, shape = 23, fill = "green",color='black')
p<-p+geom_point(data = whereAllWent, aes(x = lonTo, y = latTo, fill=log10(fracTo),colour=log10(fracTo)), size = 1, shape = 23)
p<-p+geom_point(data = EstartPoints, aes(x = lonFrom, y = latFrom), size = 0.5, shape = 23, fill = "green",color='green')
print(p) #this makes the figure appear
Sometimes you just want to know where particles from a particular point go. This is also straightforward. First define a latitude and longitude where you want the particles to be released from, and then calculate the closest release point in the EZfate grid.
#specify a point you are interested in
lonP<--70.600 ; latP<-42.995 #Isle of Shoals in NH and Maine
#compute squared distance between lonP and latP and starting points approximately, but
#accurately enough for this work. Answer in units of degrees lat SQUARED. This works because
#the length of a degree of longitude scales as cos(latitude)
dist<-(E$latFrom-latP)^2+((E$lonFrom-lonP)*cos(E$latFrom*2*pi/360.0))**2
#what row of E is the point of the minimum distance?
nMin<-which.min(dist)
Now lets extract the data. Note the code to extract where the
particles go, which looks like
c(E$lonTo[nMin],recursive=TRUE). This awkward code is
necessary to extract a vector from a dataFrame. The following code also
makes the data.frames to point.
#obtain latitude and longitude where particles released from (lonP,latP) have gone
plotLon<-c(E$lonTo[nMin],recursive=TRUE)
plotLat<-c(E$latTo[nMin],recursive=TRUE)
numTo<-c(E$numTo[nMin],recursive=TRUE)
#make data frames to plot
startP=data.frame(lonP=lonP,latP=latP)
endP=data.frame(plotLon=plotLon,plotLat=plotLat,numTo=numTo)
Now plot where the particles started from and where they go.
p<-ggplot(data = world) + geom_sf() +
coord_sf(xlim= c(-72, -64), ylim = c(41, 44.2), expand = FALSE)+
geom_point(data = endP, aes(x = plotLon, y = plotLat,color=numTo), size = 1)+
geom_point(data=startP, aes(x = lonP, y = latP), size = 1, shape = 23, fill = "red",color='red')+
labs(title='Red is starting point, shaded blue points are ending locations')
print(p) #this makes the figure appear
Often we will want to model the dispersal of an organism over multiple generations, or something similar. Let us, for now, imagine modeling a semelparous organism, and imagine it starts at a specific point, and each generation it will release a certain number of larvae. Where will those larvae end up, and how many will make it?
In the next section connectivity data will be used to quantitatively examine the spread of an organism from an initial location over multiple generations.