RGEE and PRISM

August 21, 2020

This is a quick rundown on the rgee package, which is pretty dope. The r-spatial folks have thrown down some sweet packages (sf, stars and mapview, just to name a few) and now rgee, which lets you work with Google Earth Engine (GEE) from R. This is some pretty slick stuff. If you’re versed in JavaScript (JS) and wondering ‘should I waste my time with this API interface?’, don’t worry. I’m not going to try to convince you to be an ‘R’ person; I’m just showing some of the basics and capabilities of using GEE from R. So, if you just want to do your JS thing on GEE, it’s all good; however, there are some perks to doing it in R, like using the tidyverse, as you’ll see.

But, most of the time I run GEE with JS to explore image collections and do some basic wrangling. The main reason I like working with R is because of the feature collection aspect and wrangling that I’m so familiar with, e.g. sf and tidyverse. I’m just more comfortable, that’s all. No reason we can’t blend, i.e. GEE with R, Python, or JS. #StopThe-R-Python-Beef.

So, this is going to be a basic rundown of how to get some precipitation data from the Parameter-elevation Regressions on Independent Slopes Model (PRISM) and attach it to the Tobacco River watershed. What we want to do is take discharge data from the Tobacco River gage in Eureka, Montana (previous post using USGS dataRetrieval) and couple it with the PRISM data we generate using rgee. From this, we’ll have discharge and precipitation data. This will give us plenty of data to explore some common distributions in hydrology in later posts, e.g. normal, lognormal, Pearson or log-Pearson, and Gumbel.

So, let’s dive in. You’ll need to follow the instructions on the r-spatial/rgee website to get the lowdown on how to install the R-Python-JS API. It’s pretty easy (ee_install()) and only takes about 10 minutes or less (I just made that up but it’s probably pretty close). One important thing is to load the library and then initialize rgee. Also, you’ll need to have an account with GEE (it’s free).

#install just once

###ee_install()

library(rgee)

ee_Initialize()

Now let’s pull in some PRISM image collections. But first we need to bring in our ‘geometry’ so that we can clip the bounds of the collections.

library(sf)
library(tidyverse)

#bring in folders I use a lot.
lands <- "D:/R_folder/Shapes/Lands"
hydro <- "D:/R_folder/Shapes/Hydro"

#grab the gaging station location
tobacoo_river_gage <- data.frame(y = 48.87777778, x = -115.05444444) %>%
  st_as_sf(coords = c("x", "y")) %>%
  st_set_crs("EPSG:4326") # already lon/lat, so no st_transform() needed

#bring in some HUC 10s
huc10 <- read_sf(paste0(hydro, "/district_huc10.shp")) %>% st_transform("EPSG:4326")

#plot
ggplot(data = huc10) +
  geom_sf() +
  geom_sf_text(aes(label = word(Name, 1)), size = 2) +
  geom_sf(data = tobacoo_river_gage, aes(color = "Gaging Station")) +
  theme_classic() +
  labs(color = "USGS Tobacco River")

Looks like we could get away with including ‘Fortine’ and ‘Tobacco’, but you could probably clean it up by creating a flow path grid and then extracting the necessary catchments. However, we’ll just keep it simple and use the two HUC 10s, as most (about 90%) of that area drains into the gaging station (trust me…). To do that we need to filter down to those two polygons and then union them into one so that we can start using GEE.

polygon <- huc10 %>% filter(str_detect(word(Name,1), "Fortine|Tobacco")) %>% st_union()

Geez, can we finally start working with GEE? A lot of the stuff above could be avoided if you just want to use ‘point’ geoms or have polygons already prepped.

The Basics:

  • Grab an image or collection
  • Do some filters (e.g. time, geometry, etc.)

Let’s grab the precipitation data (‘ppt’) from the monthly PRISM data. This is the total monthly precip, not the mean precipitation.

prism <- ee$ImageCollection("OREGONSTATE/PRISM/AN81m")$select('ppt')

Now let’s look at what dates we should grab. We can look at the time frame from the Tobacco River gage; remember (last post), 1959 and 2020 are not complete from the gage so we’ll leave those out. The filter will start at 1960-01-01 and stop at 2019-01-01; filterDate() treats the end date as exclusive, so the last image we get is December 2018.

start <- ee$Date('1960-01-01')
finish <- ee$Date('2019-01-01')

mask <- st_as_sf(polygon) %>% sf_as_ee()

tobacco_region <- mask$geometry()$bounds()
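One thing worth keeping in mind: bounds() hands back the bounding rectangle of the geometry, not the watershed outline itself, so anything reduced over tobacco_region covers the whole box. Here’s a rough base-R sketch of that idea using a made-up triangle (not the real HUC 10s), just to show how much extra area a bounding box can pick up.

```r
# Made-up triangle standing in for a watershed outline (closed ring of x/y pairs).
ring <- rbind(c(0, 0), c(4, 0), c(0, 3), c(0, 0))
x <- ring[, 1]
y <- ring[, 2]
n <- nrow(ring)

# Shoelace formula for the area of the polygon itself.
polygon_area <- abs(sum(x[-n] * y[-1] - x[-1] * y[-n])) / 2

# Bounding rectangle, the same idea as bounds(): min/max along each axis.
bbox_area <- diff(range(x)) * diff(range(y))

# Share of the box that falls outside the polygon.
extra_share <- 1 - polygon_area / bbox_area
```

If the extra area matters for your watershed, passing the polygon geometry instead of its bounds is the thing to try; for this post we’re keeping it simple.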

From here we can bring it all together.

filtered_prism <- prism$filterBounds(tobacco_region)$filterDate(start, finish)
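If the date window feels fuzzy, here’s a small base-R analogy (not an rgee call) of what a start-inclusive, end-exclusive filter does to a run of monthly time stamps. The month_starts vector is made up for illustration; only the start and finish dates come from above.

```r
start  <- as.Date("1960-01-01")
finish <- as.Date("2019-01-01")

# Made-up monthly time stamps that run wider than the window we want.
month_starts <- seq(as.Date("1959-01-01"), as.Date("2020-12-01"), by = "month")

# Keep start <= date < finish, i.e. the end date itself is left out.
kept <- month_starts[month_starts >= start & month_starts < finish]

first_kept <- min(kept)
last_kept  <- max(kept)
n_kept     <- length(kept)
```

So with these dates you end up with 59 full years of months, 1960 through 2018.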

Now we can extract the monthly totals for the Tobacco River drainage. First we need to reproject the GEE product to the same CRS as the ‘polygon’ we made (EPSG:4326).

filtered_prism <- filtered_prism$map(function(x) x$reproject("EPSG:4326")$select("ppt"))

Now we can extract the sum over the watershed per month per year. The sum reducer is basically taking each pixel within this area per month per year and summing the ‘ppt’ values, which will be a lot of mm’s getting added together, so let’s grab the mean as well.

prism_tabacco <- ee_extract(x = filtered_prism, y = tobacco_region, sf = FALSE, fun = ee$Reducer$sum())

prism_tabacco_mean <- ee_extract(x = filtered_prism, y = tobacco_region, sf = FALSE, fun = ee$Reducer$mean())
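To see why the mean is handy alongside the sum, here’s a tiny base-R sketch with made-up pixel values (these are not PRISM numbers). The sum grows with the number of pixels in the region, while the mean stays in plain mm no matter how big the area is.

```r
# Made-up monthly 'ppt' values (mm) for the pixels in a small and a large region.
small_region <- c(50, 60, 70)
large_region <- rep(c(50, 60, 70), times = 100)

# The sum scales with the pixel count...
region_sum_small <- sum(small_region)
region_sum_large <- sum(large_region)

# ...while the mean stays comparable between regions.
region_mean_small <- mean(small_region)
region_mean_large <- mean(large_region)
```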

Now we need to wrangle the data because it comes back in a wide format (one column per monthly image) and the variable names are long.

prism_tabacco$name <- "Tobacco"
prism_tabacco_mean$name <- "Tobacco"
library(lubridate)
library(tidyverse)

prism_mean_processed <- prism_tabacco_mean %>%
  pivot_longer(-name, names_to = "Date", values_to = "precip_mean") %>%
  mutate(Date = str_remove(Date, "X"), 
         Date = parse_date_time(Date, "ym"),
         year = year(Date), month = month(Date), day = day(Date),
         wy = smwrBase::waterYear(Date, TRUE))

prism_processed <- prism_tabacco %>%
  pivot_longer(-name, names_to = "Date", values_to = "precip_total") %>%
  mutate(Date = str_remove(Date, "X"), 
         Date = parse_date_time(Date, "ym"),
         year = year(Date), month = month(Date), day = day(Date),
         wy = smwrBase::waterYear(Date, TRUE))

prism_processed <- left_join(
  prism_processed,
  select(prism_mean_processed, wy, precip_mean, year, month),
  by = c("year", "month", "wy")
)
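If you don’t have the tidyverse or smwrBase handy, the same wrangling idea can be sketched in base R. Heads up on the assumptions: the column names below (X196009_ppt and so on) and the values are made up for illustration, and the water year here is the usual USGS convention (October through September, named for the year it ends in) rather than a call to smwrBase::waterYear().

```r
# Hypothetical wide output: one row per feature, one column per monthly image.
wide <- data.frame(X196009_ppt = 31.2, X196010_ppt = 58.4, X196101_ppt = 74.9)

# Wide to long: one row per month.
long <- data.frame(
  col = names(wide),
  precip_mean = unlist(wide[1, ], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Pull the six-digit YYYYMM stamp out of the column name and build a Date.
stamp <- sub("^X([0-9]{6}).*$", "\\1", long$col)
long$Date  <- as.Date(paste0(stamp, "01"), format = "%Y%m%d")
long$year  <- as.integer(format(long$Date, "%Y"))
long$month <- as.integer(format(long$Date, "%m"))

# Water year: October through December count toward the following year.
long$wy <- long$year + as.integer(long$month >= 10)
```

Notice that September 1960 stays in water year 1960 while October 1960 rolls into 1961.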

Sweet, now we have a dataset of the same type to compare with the Tobacco River gaging data. Let’s plot the PRISM data like we did with the river data.

Well, looks like August is the lowest (for the median). Let’s look at some other graphs before we move on, and let’s add decade, lustrum (5-year interval) and the overlaying decade thing.

prism_processed <- prism_processed %>%
  mutate(decade = year - (year %% 10),
         lustrum = year - (year %% 5),
         # month-day label, then prefixed with the last digit of the year
         month_day = str_c(month, day, sep = "-"),
         month_day = str_c(str_sub(year, -1), month_day, sep = "-"))
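The modulo trick in there is just rounding a year down to the start of its bin. A quick base-R sketch with a few example years (picked for illustration) shows what decade and lustrum end up holding.

```r
years <- c(1960, 1967, 1975, 2018)

# Subtracting the remainder rounds each year down to the start of its bin.
decade  <- years - (years %% 10)
lustrum <- years - (years %% 5)
```

So 1967 lands in the 1960 decade but the 1965 lustrum, and 2018 lands in 2010 and 2015.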

These plots are actually not that bad if you make them interactive using plotly. The graph below will let you hover over areas and get data depending on what your eye homes in on (upper right corner has different viewing options).

Wonder what it would look like if we compared the per-year discharge with the per-year precipitation. Is there a relationship between the previous month’s precipitation and discharge? Let’s look.

Many ways to do this but we’ll keep it simple and just take the scaled values of each variable.

river_total <- tobacco_river %>%
  filter(!year %in% c("1958", "1959", "2019", "2020")) %>%
  mutate(wy = smwrBase::waterYear(Date, TRUE)) %>%
  group_by(year, month) %>%
  summarise(total_q = sum(Flow), max_q = max(Flow), sd_q = round(sd(Flow), 2))

all_tobacco <- left_join(prism_processed, river_total, by = c("year", "month"))
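For the ‘scaled values’ part, here’s a minimal base-R sketch of one way to do it, using scale() to put both variables on a z-score footing and then checking the previous month’s precipitation against discharge. The numbers are made up (discharge is deliberately built to echo precipitation one month later), and the column names just mirror the ones above; this isn’t the actual Tobacco River data.

```r
# Made-up monthly totals; total_q is built to follow precip_total by one month.
monthly <- data.frame(
  precip_total = c(10, 40, 20, 60, 30, 50),
  total_q      = c(500, 100, 400, 200, 600, 300)
)

# scale() centers each column on 0 and divides by its standard deviation,
# so mm of precipitation and discharge units can share one axis.
scaled <- as.data.frame(scale(monthly))

# Pair precipitation in month t with discharge in month t + 1.
lag1_cor <- cor(head(scaled$precip_total, -1), tail(scaled$total_q, -1))

# Same-month pairing, for comparison.
same_month_cor <- cor(scaled$precip_total, scaled$total_q)
```

With real data the lagged correlation won’t be anywhere near this clean, but comparing the lagged and same-month numbers is a cheap first look.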

Alright, now we have a data set with both the precipitation totals and discharge totals. Let’s just see if there is a trend between precipitation and discharge. We’ll just take about 10 years of the most recent data.

This is pretty cool. There are some great eye-ball test relationships between the two. However, we are just exploring and this would need a proper time-series analysis to see what kind of surprise we could generate. To wrap it up let’s create a table that we can filter and reference any time we want to look at certain values, months, years and so on. We can do this by using the DT package. Won’t go into too many details but here’s a quick view of our data.


For now though, let’s just end there. We’ll have some good data sets to continue on with for exploring data distributions. Have a good one!

Tags:
https://github.com/r-spatial/rgee
See Also:
USGS, GEE and R. What?!?