Contents

Excercises

UW Geospatial Data Analysis
CEE498/CEWA599
David Shean

Objectives

  • Explore spatial and temporal relationships of time series data, collected by networks of in-situ stations

  • Learn about dynamic API queries, data ingestion into Pandas/GeoPandas

  • Working with Pandas Timestamp and Python DateTime objects

  • Explore spatial correlation of time series records

  • Explore some simple interpolation routines to create continuous gridded values from sparse points

  • Explore some fundamental concepts and metrics for snow science

  • Visualize recent snow accumulation in your region

Discussion

Read a bit about SNOTEL data for the Western U.S.

https://www.wcc.nrcs.usda.gov/snow/

This is actually a nice web interface, with some advanced querying and interactive visualization. You can also download formatted ASCII files (csv) for analysis. This is great for one-time projects, but it’s nice to have deterministic code that can be updated as new data appear, without manual steps. That’s what we’re going to do here.

Sample plots for SNOTEL site at Paradise, WA (south side of Mt. Rainier)

CUAHSI WOF server and automated Python data queries

We are going to use a server set up by CUAHSI to serve the SNOTEL data, using a standardized database storage format and query structure. You don’t need to worry about this, but please quickly review the following:

Acronym soup

  • SNOTEL = Snow Telemetry

  • CUAHSI = Consortium of Universities for the Advancement of Hydrologic Science, Inc

  • WOF = WaterOneFlow

  • WSDL = Web Services Description Language

  • USDA = United States Department of Agriculture

  • NRCS = National Resources Conservation Service

  • AWDB = Air-Water Database

Python options

There are a few packages out there that offer convenience functions to query the online SNOTEL databases and unpack the results.

You can also write your own queries using the Python requests module and some built-in XML parsing libraries.

Hopefully not overwhelming amount of information - let’s just go with ulmo for now. I’ve done most of the work to prepare functions for querying and processing the data. Once you wrap your head around all of the acronyms, it’s pretty simple, basically running a few functions here: https://ulmo.readthedocs.io/en/latest/api.html#module-ulmo.cuahsi.wof

We will use ulmo with daily data for this exercise, but please feel free to experiment with hourly data, other variables or other approaches to fetch SNOTEL data.

Important ulmo installation note

We’re going to use the latest development version of ulmo, straight from the github source! This is a good exercise, and will show you how to install a package directly from source code on github.

#Install directly from github repo main branch
%pip install git+https://github.com/ulmo-dev/ulmo.git
Collecting git+https://github.com/ulmo-dev/ulmo.git
  Cloning https://github.com/ulmo-dev/ulmo.git to /tmp/pip-req-build-6m1aaxxf
Requirement already satisfied: appdirs in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (1.4.4)
Requirement already satisfied: beautifulsoup4 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (4.9.3)
Requirement already satisfied: future in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (0.18.2)
Requirement already satisfied: geojson in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (2.5.0)
Requirement already satisfied: isodate in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (0.6.0)
Requirement already satisfied: lxml in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (4.6.2)
Requirement already satisfied: numpy in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (1.19.4)
Requirement already satisfied: pandas<1.1 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (1.0.5)
Requirement already satisfied: tables in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (3.6.1)
Requirement already satisfied: requests in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (2.25.1)
Requirement already satisfied: suds-jurko in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (0.6)
Requirement already satisfied: html5lib<=0.9999999 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from ulmo==0.8.7.dev0) (0.9999999)
Requirement already satisfied: six in /srv/conda/envs/notebook/lib/python3.8/site-packages (from html5lib<=0.9999999->ulmo==0.8.7.dev0) (1.15.0)
Requirement already satisfied: python-dateutil>=2.6.1 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from pandas<1.1->ulmo==0.8.7.dev0) (2.7.5)
Requirement already satisfied: pytz>=2017.2 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from pandas<1.1->ulmo==0.8.7.dev0) (2020.4)
Requirement already satisfied: soupsieve>1.2 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from beautifulsoup4->ulmo==0.8.7.dev0) (2.2)
Requirement already satisfied: certifi>=2017.4.17 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from requests->ulmo==0.8.7.dev0) (2020.12.5)
Requirement already satisfied: idna<3,>=2.5 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from requests->ulmo==0.8.7.dev0) (2.10)
Requirement already satisfied: chardet<5,>=3.0.2 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from requests->ulmo==0.8.7.dev0) (3.0.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from requests->ulmo==0.8.7.dev0) (1.25.11)
Requirement already satisfied: numexpr>=2.6.2 in /srv/conda/envs/notebook/lib/python3.8/site-packages (from tables->ulmo==0.8.7.dev0) (2.7.2)
Note: you may need to restart the kernel to use updated packages.
import ulmo

Interactive Demo

Import necessary modules

import os
from datetime import datetime
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point
import folium
import contextily as ctx
import scipy.stats

Load state polygons for later use

states_url = 'http://eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_040_00_5m.json'
#states_url = 'http://eric.clst.org/assets/wiki/uploads/Stuff/gz_2010_us_040_00_500k.json'
states_gdf = gpd.read_file(states_url)

CUAHSI WOF server information

  • Try typing this in a browser, note what you get back

#http://his.cuahsi.org/wofws.html
wsdlurl = 'http://hydroportal.cuahsi.org/Snotel/cuahsi_1_1.asmx?WSDL'
#wsdlurl = "http://worldwater.byu.edu/interactive/snotel/services/index.php/cuahsi_1_1.asmx?WSDL"  # WOF 1.1

Part 1: Query and analyze SNOTEL site spatial information

  • Use the ulmo cuahsi interface and the get_sites function to fetch data from server

sites = ulmo.cuahsi.wof.get_sites(wsdlurl)

If you get an error (AttributeError: module 'ulmo' has no attribute 'cuahsi'), you need to restart your kernel and “Run All Above Selected Cell”

#Preview first item in dictionary
next(iter(sites.items()))
('SNOTEL:301_CA_SNTL',
 {'code': '301_CA_SNTL',
  'name': 'Adin Mtn',
  'network': 'SNOTEL',
  'location': {'latitude': '41.2358283996582',
   'longitude': '-120.79192352294922'},
  'elevation_m': '1886.7120361328125',
  'site_property': {'county': 'Modoc',
   'state': 'California',
   'site_comments': 'beginDate=10/1/1983 12:00:00 AM|endDate=1/1/2100 12:00:00 AM|HUC=180200021403|HUD=18020002|TimeZone=-8.0|actonId=20H13S|shefId=ADMC1|stationTriplet=301:CA:SNTL|isActive=True',
   'pos_accuracy_m': '0'}})

Take a moment to inspect the output

  • What is the type?

  • Note data structure and different key/value pairs for each site

Store as a Pandas DataFrame called sites_df

  • See the from_dict function

  • Use orient option so the sites comprise the DataFrame index, with columns for ‘name’, ‘elevation_m’, etc

  • Use the dropna method to remove any empty records

sites_df = pd.DataFrame.from_dict(sites, orient='index').dropna()
sites_df.head()
code name network location elevation_m site_property
SNOTEL:301_CA_SNTL 301_CA_SNTL Adin Mtn SNOTEL {'latitude': '41.2358283996582', 'longitude': ... 1886.7120361328125 {'county': 'Modoc', 'state': 'California', 'si...
SNOTEL:907_UT_SNTL 907_UT_SNTL Agua Canyon SNOTEL {'latitude': '37.522171020507813', 'longitude'... 2712.719970703125 {'county': 'Kane', 'state': 'Utah', 'site_comm...
SNOTEL:916_MT_SNTL 916_MT_SNTL Albro Lake SNOTEL {'latitude': '45.59722900390625', 'longitude':... 2529.840087890625 {'county': 'Madison', 'state': 'Montana', 'sit...
SNOTEL:1267_AK_SNTL 1267_AK_SNTL Alexander Lake SNOTEL {'latitude': '61.749668121337891', 'longitude'... 48.768001556396484 {'county': 'Matanuska-Susitna', 'state': 'Alas...
SNOTEL:908_WA_SNTL 908_WA_SNTL Alpine Meadows SNOTEL {'latitude': '47.779571533203125', 'longitude'... 1066.800048828125 {'county': 'King', 'state': 'Washington', 'sit...

Clean up the DataFrame and prepare Point geometry objects

  • We covered this with the GLAS data conversion to GeoPandas

  • Convert 'location' column (contains dictionary with 'latitude' and 'longitude' values) to Shapely Point objects

  • Store as a new 'geometry' column

  • Drop the 'location' column, as this is no longer needed

  • Update the dtype of the 'elevation_m' column to float

sites_df['geometry'] = [Point(float(loc['longitude']), float(loc['latitude'])) for loc in sites_df['location']]
sites_df = sites_df.drop(columns='location')
sites_df = sites_df.astype({"elevation_m":float})

Review output

  • Take a moment to familiarize yourself with the DataFrame structure and different columns.

  • Note that the index is a set of strings with format ‘SNOTEL:1000_OR_SNTL’

  • Extract the first record with loc

    • Review the 'site_property' dictionary - could parse this and store as separate fields in the DataFrame if desired

sites_df.head()
code name network elevation_m site_property geometry
SNOTEL:301_CA_SNTL 301_CA_SNTL Adin Mtn SNOTEL 1886.712036 {'county': 'Modoc', 'state': 'California', 'si... POINT (-120.7919235229492 41.2358283996582)
SNOTEL:907_UT_SNTL 907_UT_SNTL Agua Canyon SNOTEL 2712.719971 {'county': 'Kane', 'state': 'Utah', 'site_comm... POINT (-112.2711791992188 37.52217102050781)
SNOTEL:916_MT_SNTL 916_MT_SNTL Albro Lake SNOTEL 2529.840088 {'county': 'Madison', 'state': 'Montana', 'sit... POINT (-111.9590225219727 45.59722900390625)
SNOTEL:1267_AK_SNTL 1267_AK_SNTL Alexander Lake SNOTEL 48.768002 {'county': 'Matanuska-Susitna', 'state': 'Alas... POINT (-150.8896636962891 61.74966812133789)
SNOTEL:908_WA_SNTL 908_WA_SNTL Alpine Meadows SNOTEL 1066.800049 {'county': 'King', 'state': 'Washington', 'sit... POINT (-121.6984710693359 47.77957153320312)
sites_df.loc['SNOTEL:301_CA_SNTL']
code                                                   301_CA_SNTL
name                                                      Adin Mtn
network                                                     SNOTEL
elevation_m                                                1886.71
site_property    {'county': 'Modoc', 'state': 'California', 'si...
geometry               POINT (-120.7919235229492 41.2358283996582)
Name: SNOTEL:301_CA_SNTL, dtype: object
sites_df.loc['SNOTEL:301_CA_SNTL']['site_property']
{'county': 'Modoc',
 'state': 'California',
 'site_comments': 'beginDate=10/1/1983 12:00:00 AM|endDate=1/1/2100 12:00:00 AM|HUC=180200021403|HUD=18020002|TimeZone=-8.0|actonId=20H13S|shefId=ADMC1|stationTriplet=301:CA:SNTL|isActive=True',
 'pos_accuracy_m': '0'}

Convert to a Geopandas GeoDataFrame

  • We already have 'geometry' column, but still need to define the crs

  • Note the number of records

sites_gdf_all = gpd.GeoDataFrame(sites_df, crs='EPSG:4326')
sites_gdf_all.shape
(930, 6)
sites_gdf_all.head()
code name network elevation_m site_property geometry
SNOTEL:301_CA_SNTL 301_CA_SNTL Adin Mtn SNOTEL 1886.712036 {'county': 'Modoc', 'state': 'California', 'si... POINT (-120.79192 41.23583)
SNOTEL:907_UT_SNTL 907_UT_SNTL Agua Canyon SNOTEL 2712.719971 {'county': 'Kane', 'state': 'Utah', 'site_comm... POINT (-112.27118 37.52217)
SNOTEL:916_MT_SNTL 916_MT_SNTL Albro Lake SNOTEL 2529.840088 {'county': 'Madison', 'state': 'Montana', 'sit... POINT (-111.95902 45.59723)
SNOTEL:1267_AK_SNTL 1267_AK_SNTL Alexander Lake SNOTEL 48.768002 {'county': 'Matanuska-Susitna', 'state': 'Alas... POINT (-150.88966 61.74967)
SNOTEL:908_WA_SNTL 908_WA_SNTL Alpine Meadows SNOTEL 1066.800049 {'county': 'King', 'state': 'Washington', 'sit... POINT (-121.69847 47.77957)

Create a scatterplot showing elevation values for all sites

f, ax = plt.subplots(figsize=(10,6))
sites_gdf_all.plot(ax=ax, column='elevation_m', markersize=3, cmap='inferno', legend=True)
ax.autoscale(False)
states_gdf.plot(ax=ax, facecolor='none', edgecolor='k', alpha=0.3);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_33_0.png

Exclude the Alaska (AK) points to isolate points over Western U.S.

  • Can remove points where the site name contains ‘AK’ or can use a spatial filter (see GeoPandas cx indexer functionality)

  • Note the number of records

sites_gdf_all = sites_gdf_all[~(sites_gdf_all.index.str.contains('AK'))]
#print(sites_gdf_all.unary_union.centroid)
#xmin, xmax, ymin, ymax = [-126, 102, 30, 50]
#sites_gdf_all = sites_gdf_all.cx[xmin:xmax, ymin:ymax]
sites_gdf_all.shape
(865, 6)

Update your scatterplot as sanity check

  • Should look something like the Western U.S.

f, ax = plt.subplots(figsize=(10,6))
sites_gdf_all.plot(ax=ax, column='elevation_m', markersize=3, cmap='inferno', legend=True)
ax.autoscale(False)
states_gdf.plot(ax=ax, facecolor='none', edgecolor='k', alpha=0.3);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_37_0.png

Export SNOTEL site GeoDataFrame as a geojson

  • Maybe useful for later lab or other analysis

sites_fn = 'snotel_conus_sites.json'
if not os.path.exists(sites_fn):
    sites_gdf_all.to_file(sites_fn, driver='GeoJSON')

Interactive plot using folium

This was extra credit on the Vector2 lab.

See the example here: https://python-visualization.github.io/folium/docs-v0.6.0/quickstart.html

For your plot:

  • Compute the centroid of the SNOTEL sites (remember GeoPandas unary_union)

  • Create a folium map object centered on this centroid

    • Use the ‘Stamen Terrain’ basemap layer

    • Experiment with zoom_start level to find a good extent

  • Export your GeoDataframe using to_json(), then load all features using folium.features.GeoJson

  • Add the points to the map

Take a moment to explore this interactive map interface.

c = sites_gdf_all.unary_union.centroid
lat = c.y
lon = c.x

Create a folium map with clustering

import folium
from folium.plugins import MarkerCluster
m = folium.Map(location=[lat, lon], tiles='Stamen Terrain', zoom_start=5)
#Create clustered map with popups
locations, popups = [], []
for idx,row in sites_gdf_all.iterrows():
    locations.append([row['geometry'].y, row['geometry'].x])
    popups.append(idx)
t = folium.FeatureGroup(name='SNOTEL')
t.add_child(MarkerCluster(locations=locations, popups=popups))
m.add_child(t)

Create a histogram of SNOTEL site elevations

  • What is the highest SNOTEL site in the Western U.S.?

  • Thought question: Do these elevations seem to provide a good sample of elevations where we expect snow to accumulate?

sites_gdf_all.hist('elevation_m', bins=128);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_45_0.png
sitecode = sites_gdf_all['elevation_m'].idxmax()
sites_gdf_all.loc[sitecode]
code                                                  1058_CO_SNTL
name                                                      Grayback
network                                                     SNOTEL
elevation_m                                                3541.78
site_property    {'county': 'Rio Grande', 'state': 'Colorado', ...
geometry              POINT (-106.5378265380859 37.47032928466797)
Name: SNOTEL:1058_CO_SNTL, dtype: object

Reproject the sites GeoDataFrame

  • Can use the same Albers Equal Area projection from previous labs, or recompute based on bounds and center of SNOTEL sites

aea_proj = '+proj=aea +lat_1=37.00 +lat_2=47.00 +lat_0=42.00 +lon_0=-114.27'
sites_gdf_all_proj = sites_gdf_all.to_crs(aea_proj)
#Reproject states
states_gdf_proj = states_gdf.to_crs(aea_proj)
#Isolate WA state polygon
wa_state = states_gdf_proj.loc[states_gdf_proj['NAME'] == 'Washington']

Create a scatterplot and overlay the state polygons

f, ax = plt.subplots(figsize=(10,6))
sites_gdf_all_proj.plot(ax=ax, column='elevation_m', markersize=3, cmap='inferno', legend=True, legend_kwds={'label':'Elevation (m)'})
ax.autoscale(False)
states_gdf_proj.plot(ax=ax, facecolor='none', edgecolor='k', alpha=0.3);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_53_0.png

Isolate WA sites

  • As with the GLAS point example, we could do intersects or a spatial join with WA polygon

  • But probably easiest to filter records with ‘WA’ in the index

    • Note: need to convert the SNOTEL DataFrame index to str

    • contains might be a nice option

  • Sanity check - note number of records and create a quick scatterplot to verify

wa_idx = sites_gdf_all_proj.index.str.contains('WA')
sites_gdf_wa = sites_gdf_all_proj[wa_idx]
sites_gdf_wa.shape
(84, 6)
#Prepare list of WA stations for use later in lab
#Can preserve as Pandas Index object
wa_stations = sites_gdf_all_proj.index[wa_idx]
#Or convert to list, if desired
#wa_stations = list(sites_gdf_all_proj.index[wa_idx])
wa_stations
Index(['SNOTEL:908_WA_SNTL', 'SNOTEL:990_WA_SNTL', 'SNOTEL:352_WA_SNTL',
       'SNOTEL:1080_WA_SNTL', 'SNOTEL:1107_WA_SNTL', 'SNOTEL:375_WA_SNTL',
       'SNOTEL:376_WA_SNTL', 'SNOTEL:942_WA_SNTL', 'SNOTEL:1109_WA_SNTL',
       'SNOTEL:1085_WA_SNTL', 'SNOTEL:418_WA_SNTL', 'SNOTEL:420_WA_SNTL',
       'SNOTEL:943_WA_SNTL', 'SNOTEL:998_WA_SNTL', 'SNOTEL:910_WA_SNTL',
       'SNOTEL:995_WA_SNTL', 'SNOTEL:994_WA_SNTL', 'SNOTEL:1004_WA_SNTL',
       'SNOTEL:478_WA_SNTL', 'SNOTEL:1159_WA_SNTL', 'SNOTEL:1256_WA_SNTL',
       'SNOTEL:502_WA_SNTL', 'SNOTEL:507_WA_SNTL', 'SNOTEL:515_WA_SNTL',
       'SNOTEL:991_WA_SNTL', 'SNOTEL:928_WA_SNTL', 'SNOTEL:1129_WA_SNTL',
       'SNOTEL:553_WA_SNTL', 'SNOTEL:591_WA_SNTL', 'SNOTEL:599_WA_SNTL',
       'SNOTEL:606_WA_SNTL', 'SNOTEL:1069_WA_SNTL', 'SNOTEL:999_WA_SNTL',
       'SNOTEL:897_WA_SNTL', 'SNOTEL:1011_WA_SNTL', 'SNOTEL:630_WA_SNTL',
       'SNOTEL:262_WA_SNTL', 'SNOTEL:642_WA_SNTL', 'SNOTEL:644_WA_SNTL',
       'SNOTEL:648_WA_SNTL', 'SNOTEL:898_WA_SNTL', 'SNOTEL:941_WA_SNTL',
       'SNOTEL:1259_WA_SNTL', 'SNOTEL:996_WA_SNTL', 'SNOTEL:672_WA_SNTL',
       'SNOTEL:679_WA_SNTL', 'SNOTEL:681_WA_SNTL', 'SNOTEL:1104_WA_SNTL',
       'SNOTEL:692_WA_SNTL', 'SNOTEL:1263_WA_SNTL', 'SNOTEL:263_WA_SNTL',
       'SNOTEL:699_WA_SNTL', 'SNOTEL:702_WA_SNTL', 'SNOTEL:707_WA_SNTL',
       'SNOTEL:711_WA_SNTL', 'SNOTEL:911_WA_SNTL', 'SNOTEL:728_WA_SNTL',
       'SNOTEL:734_WA_SNTL', 'SNOTEL:1231_WA_SNTL', 'SNOTEL:1068_WA_SNTL',
       'SNOTEL:1043_WA_SNTL', 'SNOTEL:748_WA_SNTL', 'SNOTEL:1257_WA_SNTL',
       'SNOTEL:912_WA_SNTL', 'SNOTEL:985_WA_SNTL', 'SNOTEL:776_WA_SNTL',
       'SNOTEL:777_WA_SNTL', 'SNOTEL:984_WA_SNTL', 'SNOTEL:788_WA_SNTL',
       'SNOTEL:791_WA_SNTL', 'SNOTEL:1157_WA_SNTL', 'SNOTEL:264_WA_SNTL',
       'SNOTEL:804_WA_SNTL', 'SNOTEL:975_WA_SNTL', 'SNOTEL:1012_WA_SNTL',
       'SNOTEL:817_WA_SNTL', 'SNOTEL:899_WA_SNTL', 'SNOTEL:824_WA_SNTL',
       'SNOTEL:1171_WA_SNTL', 'SNOTEL:832_WA_SNTL', 'SNOTEL:841_WA_SNTL',
       'SNOTEL:974_WA_SNTL', 'SNOTEL:909_WA_SNTL', 'SNOTEL:863_WA_SNTL'],
      dtype='object')
#Prepare a list of WA stations above a predefined elevation threshold
high_thresh = 1400 #meters
wa_stations_high = sites_gdf_wa.index[sites_gdf_wa['elevation_m'] > high_thresh]
wa_stations_high
Index(['SNOTEL:1080_WA_SNTL', 'SNOTEL:1107_WA_SNTL', 'SNOTEL:375_WA_SNTL',
       'SNOTEL:376_WA_SNTL', 'SNOTEL:1085_WA_SNTL', 'SNOTEL:418_WA_SNTL',
       'SNOTEL:998_WA_SNTL', 'SNOTEL:1004_WA_SNTL', 'SNOTEL:1159_WA_SNTL',
       'SNOTEL:502_WA_SNTL', 'SNOTEL:507_WA_SNTL', 'SNOTEL:515_WA_SNTL',
       'SNOTEL:1129_WA_SNTL', 'SNOTEL:599_WA_SNTL', 'SNOTEL:606_WA_SNTL',
       'SNOTEL:1011_WA_SNTL', 'SNOTEL:630_WA_SNTL', 'SNOTEL:262_WA_SNTL',
       'SNOTEL:642_WA_SNTL', 'SNOTEL:644_WA_SNTL', 'SNOTEL:679_WA_SNTL',
       'SNOTEL:681_WA_SNTL', 'SNOTEL:692_WA_SNTL', 'SNOTEL:707_WA_SNTL',
       'SNOTEL:711_WA_SNTL', 'SNOTEL:1068_WA_SNTL', 'SNOTEL:1043_WA_SNTL',
       'SNOTEL:984_WA_SNTL', 'SNOTEL:264_WA_SNTL', 'SNOTEL:824_WA_SNTL',
       'SNOTEL:832_WA_SNTL', 'SNOTEL:974_WA_SNTL'],
      dtype='object')

Create scatterplot to verify and add contextily basemap

  • Can specify our AEA crs to the crs keyword in the ctx add_basemap function to reproject on the fly

f, ax = plt.subplots(figsize=(10,6))
wa_state.plot(ax=ax, facecolor='none', edgecolor='black')
sites_gdf_wa.plot(ax=ax, column='elevation_m', markersize=20, edgecolor='k', cmap='inferno', \
                  legend=True, legend_kwds={'label':'Elevation (m)'})
ctx.add_basemap(ax=ax, crs=sites_gdf_wa.crs, source=ctx.providers.Stamen.Terrain)
ax.set_title('All WA SNOTEL Stations');
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_59_0.png
f, ax = plt.subplots(figsize=(10,6))
wa_state.plot(ax=ax, facecolor='none', edgecolor='black')
sites_gdf_wa.loc[wa_stations_high].plot(ax=ax, column='elevation_m', markersize=20, edgecolor='k', \
                                        cmap='inferno', legend=True, legend_kwds={'label':'Elevation (m)'})
ctx.add_basemap(ax=ax, crs=sites_gdf_wa.crs, source=ctx.providers.Stamen.Terrain)
ax.set_title('All WA SNOTEL Stations > %0.0f m' % high_thresh);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_60_0.png

Create a histogram plot showing elevation of all SNOTEL sites in Western US and the WA sample

  • These should be two histograms on the same axes

  • Discussion question: What do you notice about the WA sample?

ax = sites_gdf_all.hist('elevation_m', bins=128)
sites_gdf_wa.hist('elevation_m', bins=64, ax=ax);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_62_0.png

Identify the highest site in WA

  • Assign the index to a variable named sitecode

sitecode_max = sites_gdf_wa['elevation_m'].idxmax()
sitecode = sitecode_max
sites_gdf_wa.loc[sitecode]
code                                                   515_WA_SNTL
name                                                    Harts Pass
network                                                     SNOTEL
elevation_m                                                1978.15
site_property    {'county': 'Okanogan', 'state': 'Washington', ...
geometry              POINT (-471261.4818625616 765568.9706388527)
Name: SNOTEL:515_WA_SNTL, dtype: object

Query the server for information about this site

  • Use the ulmo cuahsi get_site_info method here

  • Lots of output here, try to skim and get a sense of the different parameters and associated metadata

site_info = ulmo.cuahsi.wof.get_site_info(wsdlurl, sitecode)
#site_info

Inspect the returned information

  • Note number of available variables in the time series data!

dict_keys = site_info['series'].keys()
dict_keys
dict_keys(['SNOTEL:BATT_D', 'SNOTEL:BATT_H', 'SNOTEL:PRCP_y', 'SNOTEL:PRCP_sm', 'SNOTEL:PRCP_m', 'SNOTEL:PRCP_wy', 'SNOTEL:PRCPSA_y', 'SNOTEL:PRCPSA_D', 'SNOTEL:PRCPSA_sm', 'SNOTEL:PRCPSA_m', 'SNOTEL:PRCPSA_wy', 'SNOTEL:PREC_sm', 'SNOTEL:PREC_m', 'SNOTEL:PREC_wy', 'SNOTEL:PVPV_H', 'SNOTEL:RHUMV_H', 'SNOTEL:SNWD_D', 'SNOTEL:SNWD_sm', 'SNOTEL:SNWD_H', 'SNOTEL:SNWD_m', 'SNOTEL:SRAD_D', 'SNOTEL:SRAD_H', 'SNOTEL:SRADN_D', 'SNOTEL:SRADN_sm', 'SNOTEL:SRADN_m', 'SNOTEL:SRADV_D', 'SNOTEL:SRADV_sm', 'SNOTEL:SRADV_H', 'SNOTEL:SRADV_m', 'SNOTEL:SRADX_D', 'SNOTEL:SRADX_sm', 'SNOTEL:SRADX_H', 'SNOTEL:SRADX_m', 'SNOTEL:TAVG_y', 'SNOTEL:TAVG_D', 'SNOTEL:TAVG_sm', 'SNOTEL:TAVG_m', 'SNOTEL:TAVG_wy', 'SNOTEL:TAVG_H', 'SNOTEL:TMAX_y', 'SNOTEL:TMAX_D', 'SNOTEL:TMAX_sm', 'SNOTEL:TMAX_m', 'SNOTEL:TMAX_wy', 'SNOTEL:TMIN_y', 'SNOTEL:TMIN_D', 'SNOTEL:TMIN_sm', 'SNOTEL:TMIN_m', 'SNOTEL:TMIN_wy', 'SNOTEL:TOBS_D', 'SNOTEL:TOBS_sm', 'SNOTEL:TOBS_H', 'SNOTEL:TOBS_m', 'SNOTEL:WDIRV_H', 'SNOTEL:WSPDV_H', 'SNOTEL:WSPDX_H', 'SNOTEL:WTEQ_D', 'SNOTEL:WTEQ_sm', 'SNOTEL:WTEQ_H', 'SNOTEL:WTEQ_m'])
len(dict_keys)
60

Part 2: Time series analysis for one station

Let’s consider the ‘SNOTEL:SNWD_D’ variable (Daily Snow Depth)

  • Assign ‘SNOTEL:SNWD_D’ to a variable named variablecode

  • Get some information about the variable using get_variable_info method

  • Note the units, nodata value, etc.

#Daily SWE
#variablecode = 'SNOTEL:WTEQ_D'
#Daily snow depth
variablecode = 'SNOTEL:SNWD_D'
#Hourly SWE
#variablecode = 'SNOTEL:WTEQ_H'
#Hourly snow depth
#variablecode = 'SNOTEL:SNWD_H'
ulmo.cuahsi.wof.get_variable_info(wsdlurl, variablecode)
{'value_type': 'Field Observation',
 'data_type': 'Continuous',
 'general_category': 'Soil',
 'sample_medium': 'Snow',
 'no_data_value': '-9999',
 'speciation': 'Not Applicable',
 'code': 'SNWD_D',
 'id': '176',
 'name': 'Snow depth',
 'vocabulary': 'SNOTEL',
 'time': {'is_regular': True,
  'interval': '1',
  'units': {'abbreviation': 'd',
   'code': '104',
   'name': 'day',
   'type': 'Time'}},
 'units': {'abbreviation': 'in',
  'code': '49',
  'name': 'international inch',
  'type': 'Length'}}

Define a function to fetch data

  • I’ve done this for you, but please review the comments and steps to see what is going on under the hood

  • You’ll probably have to do similar data wrangling for another project at some point in the future

#Get current datetime
today = datetime.today().strftime('%Y-%m-%d')

def fetch(sitecode, variablecode='SNOTEL:SNWD_D', start_date='1950-10-01', end_date=today):
    #print(sitecode, variablecode, start_date, end_date)
    values_df = None
    try:
        #Request data from the server
        site_values = ulmo.cuahsi.wof.get_values(wsdlurl, sitecode, variablecode, start=start_date, end=end_date)
        #Convert to a Pandas DataFrame   
        values_df = pd.DataFrame.from_dict(site_values['values'])
        #Parse the datetime values to Pandas Timestamp objects
        values_df['datetime'] = pd.to_datetime(values_df['datetime'], utc=True)
        #Set the DataFrame index to the Timestamps
        values_df = values_df.set_index('datetime')
        #Convert values to float and replace -9999 nodata values with NaN
        values_df['value'] = pd.to_numeric(values_df['value']).replace(-9999, np.nan)
        #Remove any records flagged with lower quality
        values_df = values_df[values_df['quality_control_level_code'] == '1']
    except:
        print("Unable to fetch %s" % variablecode)

    return values_df

Use this function to get the full ‘SNOTEL:SNWD_D’ record for one station

  • Can use Paradise station (SNOTEL:679_WA_SNTL), the highest station in WA (from above), or go back up to your folium plot with clusters/labels and pick a site of your choosing!

  • Inspect the results

  • We used a dummy start date of Jan 1, 1950. What is the actual the first date returned?

#Hart's Pass
#sitecode = 'SNOTEL:515_WA_SNTL'
#Paradise
sitecode = 'SNOTEL:679_WA_SNTL'
start_date = datetime(1950,1,1)
end_date = datetime.today()

print(sitecode)
values_df = fetch(sitecode, variablecode, start_date, end_date)
#values_df = fetch(sitecode, variablecode)
values_df.tail()
SNOTEL:679_WA_SNTL
value qualifiers censor_code date_time_utc method_id method_code source_code quality_control_level_code
datetime
2021-02-21 00:00:00+00:00 171.0 V nc 2021-02-21T00:00:00 0 0 1 1
2021-02-22 00:00:00+00:00 171.0 V nc 2021-02-22T00:00:00 0 0 1 1
2021-02-23 00:00:00+00:00 188.0 E nc 2021-02-23T00:00:00 0 0 1 1
2021-02-24 00:00:00+00:00 195.0 V nc 2021-02-24T00:00:00 0 0 1 1
2021-02-25 00:00:00+00:00 193.0 V nc 2021-02-25T00:00:00 0 0 1 1
#Get number of decimal years between first and last observation
nyears = (values_df.index.max() - values_df.index.min()).days/365.25
nyears
14.524298425735797

Create a quick plot to view the time series

  • Take a moment to inspect the value column, which is where the SNWD_D values are stored

  • Sanity check thought question: What are the units again?

values_df.plot();
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_82_0.png

Compute the integer day of year (doy) and integer day of water year (dowy)

#Add DOY and DOWY column
#Need to revisit for leap year support
def add_dowy(df, col=None):
    if col is None:
        df['doy'] = df.index.dayofyear
    else:
        df['doy'] = df[col].dayofyear
    #df['dowy'] = (df['doy'].index - pd.DateOffset(months=9)).dayofyear
    # Sept 30 is doy 273
    df['dowy'] = df['doy'] - 273
    df.loc[df['dowy'] <= 0, 'dowy'] += 365
add_dowy(values_df)
#Define variable to store current year
curr_y = datetime.now().year
curr_y
2021
#Quick sanity check around beginning of WY
values_df[f'{curr_y-1}-09-28':f'{curr_y-1}-10-03']
value qualifiers censor_code date_time_utc method_id method_code source_code quality_control_level_code doy dowy
datetime
2020-09-28 00:00:00+00:00 0.0 E nc 2020-09-28T00:00:00 0 0 1 1 272 364
2020-09-29 00:00:00+00:00 0.0 E nc 2020-09-29T00:00:00 0 0 1 1 273 365
2020-09-30 00:00:00+00:00 0.0 E nc 2020-09-30T00:00:00 0 0 1 1 274 1
2020-10-01 00:00:00+00:00 0.0 E nc 2020-10-01T00:00:00 0 0 1 1 275 2
2020-10-02 00:00:00+00:00 0.0 V nc 2020-10-02T00:00:00 0 0 1 1 276 3
2020-10-03 00:00:00+00:00 0.0 E nc 2020-10-03T00:00:00 0 0 1 1 277 4
values_df[f'{curr_y-1}-12-29':f'{curr_y}-01-03']
value qualifiers censor_code date_time_utc method_id method_code source_code quality_control_level_code doy dowy
datetime
2020-12-29 00:00:00+00:00 72.0 V nc 2020-12-29T00:00:00 0 0 1 1 364 91
2020-12-30 00:00:00+00:00 75.0 E nc 2020-12-30T00:00:00 0 0 1 1 365 92
2020-12-31 00:00:00+00:00 85.0 E nc 2020-12-31T00:00:00 0 0 1 1 366 93
2021-01-01 00:00:00+00:00 83.0 V nc 2021-01-01T00:00:00 0 0 1 1 1 93
2021-01-02 00:00:00+00:00 90.0 E nc 2021-01-02T00:00:00 0 0 1 1 2 94
2021-01-03 00:00:00+00:00 100.0 E nc 2021-01-03T00:00:00 0 0 1 1 3 95

Compute statistics for each day of water year, using values from all years

  • Seems like a Pandas groupby/agg might work here

  • Stats should at least include min, max, mean, and median

stat_list = ['count','min','max','mean','std','median','mad']
doy_stats = values_df.groupby('dowy').agg(stat_list)['value']
doy_stats
count min max mean std median mad
dowy
1 15 0.0 9.0 0.666667 2.319688 0.0 1.155556
2 15 0.0 8.0 0.533333 2.065591 0.0 0.995556
3 15 0.0 9.0 0.666667 2.319688 0.0 1.155556
4 15 0.0 5.0 0.600000 1.404076 0.0 0.960000
5 15 0.0 5.0 0.600000 1.454058 0.0 0.960000
... ... ... ... ... ... ... ...
361 15 0.0 1.0 0.066667 0.258199 0.0 0.124444
362 15 0.0 1.0 0.066667 0.258199 0.0 0.124444
363 15 0.0 1.0 0.066667 0.258199 0.0 0.124444
364 15 0.0 1.0 0.066667 0.258199 0.0 0.124444
365 15 0.0 3.0 0.266667 0.798809 0.0 0.462222

365 rows × 7 columns

Create a plot of these aggregated dowy values

  • Your output independent variable (x-axis) should be day of water year (1-366), and dependent variable (y-axis) should be median value for that day of year, computed using values from all available years

    • You may have to explicitly specify the x and y valuese for your plot function

  • Something like the 30-year mean and median here: https://www.nwrfc.noaa.gov/snow/plot_SWE.php?id=AFSW1

  • Extra credit: add shaded regions for standard deviation or normalized median absolute deviation (nmad) for each doy to show spread in values over the full record

Add the daily snow depth values for the current water year

  • Can use pandas indexing here with simple strings (‘YYYY-MM-DD’), or Timestamp objects

    • Standard slicing also works with :

  • Make sure to dropna to remove any records missing data

  • Add this to your plot

For most recent snow depth value (today or yesterday), what is the percentage of “normal” snow depth (as defined by long-term median for this dowy) at this site?

Part 3: Retrieve time series records for All SNOTEL Sites

  • Now that we’ve explored one site, let’s look at them all!

  • Probably some interesting spatial/temporal variability in these metrics

  • I’ve providing the following code to do this for you. Please review so you understand what’s going on:

    • Loop through all sites names in the WA GeoDataFrame and run the above fetch function

    • Store output in a dictionary

    • Convert the dictionary to a Pandas Dataframe

  • Note that this could take some time to run for all SNOTEL sites (~10 min, depending on server load)

    • Progress will be printed out incrementally

    • Several sites may return an error (e.g., <suds.sax.document.Document object at 0x7f0813040730>), but this is handled by the try-except block in the fetch function above

    • While you wait, review the remainder of the lab

Save the DataFrame, so you don’t have to download again

  • Can use a number of different formats here, easiest to use a “pickle”: https://www.pythoncentral.io/how-to-pickle-unpickle-tutorial/

  • Define a unique filename for the dataset (e.g., pkl_fn='snotel_wa_snwd_d.pkl')

  • Write the DataFrame to disk

  • Read it to a temporary variable and verify that everything looks good

#pkl_fn = variablecode.replace(':','_')+'_'+start_date+'_'+end_date+'.pkl'
#All SNOTEL sites
pkl_fn = 'snotel_snwd_d.pkl'
gdf = sites_gdf_all

#Isolated to WA SNOTEL sites
#pkl_fn = 'snotel_snwd_d_wa.pkl'
#gdf = sites_gdf_wa
if os.path.exists(pkl_fn):
    snwd_df = pd.read_pickle(pkl_fn)
else:
    #Define an empty dictionary to store returns for each site
    value_dict = {}
    for i, sitecode in enumerate(gdf.index):
        print('%i of %i sites' % (i+1, len(gdf.index)) )
        #out = fetch(sitecode, variablecode, start_date, end_date)
        out = fetch(sitecode, variablecode)
        if out is not None:
            value_dict[sitecode] = out['value']
    #Convert the dictionary to a DataFrame, automatically handles different datetime ranges (nice!)
    snwd_df = pd.DataFrame.from_dict(value_dict)
    #Write out
    print(f"Writing out: {pkl_fn}")
    snwd_df.to_pickle(pkl_fn)
1 of 865 sites
2 of 865 sites
3 of 865 sites
4 of 865 sites
5 of 865 sites
6 of 865 sites
7 of 865 sites
8 of 865 sites
9 of 865 sites
10 of 865 sites
<suds.sax.document.Document object at 0x7f7536152580>
Unable to fetch SNOTEL:SNWD_D
11 of 865 sites
12 of 865 sites
<suds.sax.document.Document object at 0x7f75889eeb20>
Unable to fetch SNOTEL:SNWD_D
13 of 865 sites
14 of 865 sites
15 of 865 sites
16 of 865 sites
17 of 865 sites
18 of 865 sites
19 of 865 sites
20 of 865 sites
21 of 865 sites
22 of 865 sites
23 of 865 sites
24 of 865 sites
25 of 865 sites
26 of 865 sites
27 of 865 sites
28 of 865 sites
29 of 865 sites
30 of 865 sites
31 of 865 sites
32 of 865 sites
33 of 865 sites
34 of 865 sites
35 of 865 sites
36 of 865 sites
37 of 865 sites
38 of 865 sites
39 of 865 sites
40 of 865 sites
41 of 865 sites
42 of 865 sites
43 of 865 sites
44 of 865 sites
45 of 865 sites
46 of 865 sites
47 of 865 sites
48 of 865 sites
49 of 865 sites
<suds.sax.document.Document object at 0x7f7545a321c0>
Unable to fetch SNOTEL:SNWD_D
50 of 865 sites
51 of 865 sites
52 of 865 sites
53 of 865 sites
54 of 865 sites
55 of 865 sites
56 of 865 sites
57 of 865 sites
58 of 865 sites
59 of 865 sites
60 of 865 sites
61 of 865 sites
62 of 865 sites
63 of 865 sites
64 of 865 sites
65 of 865 sites
66 of 865 sites
67 of 865 sites
68 of 865 sites
69 of 865 sites
70 of 865 sites
71 of 865 sites
72 of 865 sites
73 of 865 sites
74 of 865 sites
75 of 865 sites
76 of 865 sites
77 of 865 sites
78 of 865 sites
79 of 865 sites
80 of 865 sites
81 of 865 sites
82 of 865 sites
83 of 865 sites
84 of 865 sites
85 of 865 sites
86 of 865 sites
87 of 865 sites
88 of 865 sites
89 of 865 sites
90 of 865 sites
91 of 865 sites
92 of 865 sites
93 of 865 sites
94 of 865 sites
<suds.sax.document.Document object at 0x7f7545a0df70>
Unable to fetch SNOTEL:SNWD_D
95 of 865 sites
96 of 865 sites
97 of 865 sites
98 of 865 sites
99 of 865 sites
100 of 865 sites
101 of 865 sites
102 of 865 sites
103 of 865 sites
104 of 865 sites
<suds.sax.document.Document object at 0x7f7536102370>
Unable to fetch SNOTEL:SNWD_D
105 of 865 sites
106 of 865 sites
107 of 865 sites
108 of 865 sites
109 of 865 sites
110 of 865 sites
111 of 865 sites
112 of 865 sites
113 of 865 sites
114 of 865 sites
115 of 865 sites
116 of 865 sites
117 of 865 sites
118 of 865 sites
119 of 865 sites
120 of 865 sites
121 of 865 sites
122 of 865 sites
123 of 865 sites
124 of 865 sites
125 of 865 sites
126 of 865 sites
127 of 865 sites
128 of 865 sites
129 of 865 sites
130 of 865 sites
131 of 865 sites
132 of 865 sites
<suds.sax.document.Document object at 0x7f753616da00>
Unable to fetch SNOTEL:SNWD_D
133 of 865 sites
134 of 865 sites
135 of 865 sites
136 of 865 sites
<suds.sax.document.Document object at 0x7f75361373d0>
Unable to fetch SNOTEL:SNWD_D
137 of 865 sites
138 of 865 sites
139 of 865 sites
140 of 865 sites
141 of 865 sites
142 of 865 sites
143 of 865 sites
144 of 865 sites
145 of 865 sites
146 of 865 sites
147 of 865 sites
148 of 865 sites
<suds.sax.document.Document object at 0x7f75458c1190>
Unable to fetch SNOTEL:SNWD_D
149 of 865 sites
150 of 865 sites
151 of 865 sites
152 of 865 sites
153 of 865 sites
154 of 865 sites
155 of 865 sites
156 of 865 sites
157 of 865 sites
158 of 865 sites
159 of 865 sites
160 of 865 sites
161 of 865 sites
162 of 865 sites
163 of 865 sites
164 of 865 sites
165 of 865 sites
166 of 865 sites
167 of 865 sites
168 of 865 sites
169 of 865 sites
170 of 865 sites
171 of 865 sites
172 of 865 sites
173 of 865 sites
174 of 865 sites
175 of 865 sites
176 of 865 sites
177 of 865 sites
178 of 865 sites
179 of 865 sites
<suds.sax.document.Document object at 0x7f753611eb50>
Unable to fetch SNOTEL:SNWD_D
180 of 865 sites
181 of 865 sites
182 of 865 sites
183 of 865 sites
184 of 865 sites
185 of 865 sites
186 of 865 sites
187 of 865 sites
188 of 865 sites
189 of 865 sites
190 of 865 sites
191 of 865 sites
192 of 865 sites
193 of 865 sites
194 of 865 sites
195 of 865 sites
196 of 865 sites
197 of 865 sites
198 of 865 sites
199 of 865 sites
200 of 865 sites
201 of 865 sites
202 of 865 sites
203 of 865 sites
204 of 865 sites
<suds.sax.document.Document object at 0x7f75361a7ee0>
Unable to fetch SNOTEL:SNWD_D
205 of 865 sites
206 of 865 sites
207 of 865 sites
208 of 865 sites
<suds.sax.document.Document object at 0x7f753611ef70>
Unable to fetch SNOTEL:SNWD_D
209 of 865 sites
210 of 865 sites
211 of 865 sites
212 of 865 sites
213 of 865 sites
214 of 865 sites
215 of 865 sites
216 of 865 sites
217 of 865 sites
218 of 865 sites
219 of 865 sites
220 of 865 sites
221 of 865 sites
222 of 865 sites
223 of 865 sites
224 of 865 sites
225 of 865 sites
226 of 865 sites
227 of 865 sites
228 of 865 sites
229 of 865 sites
230 of 865 sites
231 of 865 sites
232 of 865 sites
233 of 865 sites
234 of 865 sites
235 of 865 sites
236 of 865 sites
237 of 865 sites
238 of 865 sites
239 of 865 sites
240 of 865 sites
241 of 865 sites
242 of 865 sites
243 of 865 sites
244 of 865 sites
245 of 865 sites
<suds.sax.document.Document object at 0x7f7536102af0>
Unable to fetch SNOTEL:SNWD_D
246 of 865 sites
<suds.sax.document.Document object at 0x7f7536158100>
Unable to fetch SNOTEL:SNWD_D
247 of 865 sites
<suds.sax.document.Document object at 0x7f75459d0340>
Unable to fetch SNOTEL:SNWD_D
248 of 865 sites
<suds.sax.document.Document object at 0x7f75459d0c10>
Unable to fetch SNOTEL:SNWD_D
249 of 865 sites
<suds.sax.document.Document object at 0x7f753611ea60>
Unable to fetch SNOTEL:SNWD_D
250 of 865 sites
<suds.sax.document.Document object at 0x7f753611e850>
Unable to fetch SNOTEL:SNWD_D
251 of 865 sites
<suds.sax.document.Document object at 0x7f75459d05e0>
Unable to fetch SNOTEL:SNWD_D
252 of 865 sites
<suds.sax.document.Document object at 0x7f7535c843d0>
Unable to fetch SNOTEL:SNWD_D
253 of 865 sites
<suds.sax.document.Document object at 0x7f75361858e0>
Unable to fetch SNOTEL:SNWD_D
254 of 865 sites
<suds.sax.document.Document object at 0x7f753611ef10>
Unable to fetch SNOTEL:SNWD_D
255 of 865 sites
<suds.sax.document.Document object at 0x7f7535c1ffd0>
Unable to fetch SNOTEL:SNWD_D
256 of 865 sites
<suds.sax.document.Document object at 0x7f7535aedcd0>
Unable to fetch SNOTEL:SNWD_D
257 of 865 sites
<suds.sax.document.Document object at 0x7f7535aed1c0>
Unable to fetch SNOTEL:SNWD_D
258 of 865 sites
<suds.sax.document.Document object at 0x7f75457778b0>
Unable to fetch SNOTEL:SNWD_D
259 of 865 sites
<suds.sax.document.Document object at 0x7f7536102d60>
Unable to fetch SNOTEL:SNWD_D
260 of 865 sites
<suds.sax.document.Document object at 0x7f7535ac7100>
Unable to fetch SNOTEL:SNWD_D
261 of 865 sites
262 of 865 sites
263 of 865 sites
264 of 865 sites
265 of 865 sites
266 of 865 sites
267 of 865 sites
268 of 865 sites
269 of 865 sites
270 of 865 sites
271 of 865 sites
272 of 865 sites
273 of 865 sites
274 of 865 sites
275 of 865 sites
276 of 865 sites
277 of 865 sites
278 of 865 sites
279 of 865 sites
280 of 865 sites
281 of 865 sites
282 of 865 sites
283 of 865 sites
284 of 865 sites
285 of 865 sites
286 of 865 sites
287 of 865 sites
288 of 865 sites
289 of 865 sites
290 of 865 sites
291 of 865 sites
292 of 865 sites
293 of 865 sites
294 of 865 sites
295 of 865 sites
296 of 865 sites
297 of 865 sites
298 of 865 sites
299 of 865 sites
<suds.sax.document.Document object at 0x7f75361a75e0>
Unable to fetch SNOTEL:SNWD_D
300 of 865 sites
301 of 865 sites
302 of 865 sites
303 of 865 sites
304 of 865 sites
305 of 865 sites
306 of 865 sites
307 of 865 sites
308 of 865 sites
309 of 865 sites
310 of 865 sites
311 of 865 sites
312 of 865 sites
313 of 865 sites
314 of 865 sites
315 of 865 sites
316 of 865 sites
317 of 865 sites
318 of 865 sites
319 of 865 sites
320 of 865 sites
321 of 865 sites
322 of 865 sites
323 of 865 sites
<suds.sax.document.Document object at 0x7f75361021f0>
Unable to fetch SNOTEL:SNWD_D
324 of 865 sites
325 of 865 sites
326 of 865 sites
327 of 865 sites
328 of 865 sites
329 of 865 sites
330 of 865 sites
331 of 865 sites
332 of 865 sites
333 of 865 sites
334 of 865 sites
335 of 865 sites
336 of 865 sites
337 of 865 sites
338 of 865 sites
339 of 865 sites
340 of 865 sites
341 of 865 sites
342 of 865 sites
343 of 865 sites
344 of 865 sites
345 of 865 sites
346 of 865 sites
347 of 865 sites
348 of 865 sites
349 of 865 sites
350 of 865 sites
351 of 865 sites
352 of 865 sites
353 of 865 sites
354 of 865 sites
355 of 865 sites
356 of 865 sites
357 of 865 sites
358 of 865 sites
359 of 865 sites
360 of 865 sites
361 of 865 sites
362 of 865 sites
363 of 865 sites
<suds.sax.document.Document object at 0x7f7545afa670>
Unable to fetch SNOTEL:SNWD_D
364 of 865 sites
365 of 865 sites
366 of 865 sites
367 of 865 sites
368 of 865 sites
369 of 865 sites
370 of 865 sites
371 of 865 sites
372 of 865 sites
373 of 865 sites
<suds.sax.document.Document object at 0x7f7545a93b20>
Unable to fetch SNOTEL:SNWD_D
374 of 865 sites
375 of 865 sites
<suds.sax.document.Document object at 0x7f7545afa250>
Unable to fetch SNOTEL:SNWD_D
376 of 865 sites
377 of 865 sites
378 of 865 sites
379 of 865 sites
380 of 865 sites
381 of 865 sites
382 of 865 sites
383 of 865 sites
384 of 865 sites
385 of 865 sites
386 of 865 sites
387 of 865 sites
388 of 865 sites
389 of 865 sites
390 of 865 sites
391 of 865 sites
392 of 865 sites
393 of 865 sites
394 of 865 sites
395 of 865 sites
396 of 865 sites
397 of 865 sites
398 of 865 sites
399 of 865 sites
400 of 865 sites
401 of 865 sites
402 of 865 sites
403 of 865 sites
404 of 865 sites
405 of 865 sites
406 of 865 sites
407 of 865 sites
408 of 865 sites
409 of 865 sites
410 of 865 sites
411 of 865 sites
412 of 865 sites
413 of 865 sites
414 of 865 sites
415 of 865 sites
416 of 865 sites
417 of 865 sites
418 of 865 sites
419 of 865 sites
420 of 865 sites
421 of 865 sites
<suds.sax.document.Document object at 0x7f7588845f10>
Unable to fetch SNOTEL:SNWD_D
422 of 865 sites
423 of 865 sites
424 of 865 sites
425 of 865 sites
426 of 865 sites
427 of 865 sites
428 of 865 sites
429 of 865 sites
430 of 865 sites
431 of 865 sites
432 of 865 sites
433 of 865 sites
434 of 865 sites
435 of 865 sites
436 of 865 sites
437 of 865 sites
438 of 865 sites
439 of 865 sites
440 of 865 sites
441 of 865 sites
442 of 865 sites
443 of 865 sites
444 of 865 sites
445 of 865 sites
446 of 865 sites
447 of 865 sites
<suds.sax.document.Document object at 0x7f75888f31c0>
Unable to fetch SNOTEL:SNWD_D
448 of 865 sites
449 of 865 sites
450 of 865 sites
451 of 865 sites
452 of 865 sites
453 of 865 sites
454 of 865 sites
455 of 865 sites
456 of 865 sites
457 of 865 sites
458 of 865 sites
459 of 865 sites
460 of 865 sites
461 of 865 sites
462 of 865 sites
463 of 865 sites
464 of 865 sites
<suds.sax.document.Document object at 0x7f75888ae3d0>
Unable to fetch SNOTEL:SNWD_D
465 of 865 sites
466 of 865 sites
467 of 865 sites
468 of 865 sites
469 of 865 sites
470 of 865 sites
471 of 865 sites
472 of 865 sites
473 of 865 sites
<suds.sax.document.Document object at 0x7f7588a332b0>
Unable to fetch SNOTEL:SNWD_D
474 of 865 sites
475 of 865 sites
476 of 865 sites
477 of 865 sites
478 of 865 sites
479 of 865 sites
480 of 865 sites
481 of 865 sites
482 of 865 sites
483 of 865 sites
484 of 865 sites
485 of 865 sites
486 of 865 sites
487 of 865 sites
488 of 865 sites
489 of 865 sites
490 of 865 sites
491 of 865 sites
492 of 865 sites
493 of 865 sites
494 of 865 sites
495 of 865 sites
496 of 865 sites
497 of 865 sites
498 of 865 sites
<suds.sax.document.Document object at 0x7f758889ce80>
Unable to fetch SNOTEL:SNWD_D
499 of 865 sites
500 of 865 sites
<suds.sax.document.Document object at 0x7f7588a33700>
Unable to fetch SNOTEL:SNWD_D
501 of 865 sites
502 of 865 sites
503 of 865 sites
504 of 865 sites
505 of 865 sites
506 of 865 sites
507 of 865 sites
508 of 865 sites
509 of 865 sites
510 of 865 sites
511 of 865 sites
512 of 865 sites
513 of 865 sites
514 of 865 sites
515 of 865 sites
516 of 865 sites
517 of 865 sites
518 of 865 sites
519 of 865 sites
520 of 865 sites
521 of 865 sites
522 of 865 sites
523 of 865 sites
524 of 865 sites
525 of 865 sites
526 of 865 sites
527 of 865 sites
528 of 865 sites
529 of 865 sites
530 of 865 sites
<suds.sax.document.Document object at 0x7f75889197f0>
Unable to fetch SNOTEL:SNWD_D
531 of 865 sites
532 of 865 sites
533 of 865 sites
534 of 865 sites
535 of 865 sites
536 of 865 sites
537 of 865 sites
538 of 865 sites
539 of 865 sites
<suds.sax.document.Document object at 0x7f7535b07a00>
Unable to fetch SNOTEL:SNWD_D
540 of 865 sites
<suds.sax.document.Document object at 0x7f7588980580>
Unable to fetch SNOTEL:SNWD_D
541 of 865 sites
542 of 865 sites
543 of 865 sites
544 of 865 sites
545 of 865 sites
546 of 865 sites
547 of 865 sites
548 of 865 sites
549 of 865 sites
550 of 865 sites
551 of 865 sites
552 of 865 sites
553 of 865 sites
554 of 865 sites
555 of 865 sites
<suds.sax.document.Document object at 0x7f75889c25e0>
Unable to fetch SNOTEL:SNWD_D
556 of 865 sites
557 of 865 sites
<suds.sax.document.Document object at 0x7f7588a33d00>
Unable to fetch SNOTEL:SNWD_D
558 of 865 sites
559 of 865 sites
560 of 865 sites
561 of 865 sites
562 of 865 sites
563 of 865 sites
564 of 865 sites
565 of 865 sites
566 of 865 sites
567 of 865 sites
568 of 865 sites
<suds.sax.document.Document object at 0x7f758893cb50>
Unable to fetch SNOTEL:SNWD_D
569 of 865 sites
570 of 865 sites
571 of 865 sites
572 of 865 sites
573 of 865 sites
574 of 865 sites
575 of 865 sites
576 of 865 sites
577 of 865 sites
<suds.sax.document.Document object at 0x7f7588a33100>
Unable to fetch SNOTEL:SNWD_D
578 of 865 sites
579 of 865 sites
580 of 865 sites
581 of 865 sites
582 of 865 sites
583 of 865 sites
584 of 865 sites
585 of 865 sites
586 of 865 sites
587 of 865 sites
588 of 865 sites
589 of 865 sites
590 of 865 sites
591 of 865 sites
592 of 865 sites
593 of 865 sites
<suds.sax.document.Document object at 0x7f7588a33490>
Unable to fetch SNOTEL:SNWD_D
594 of 865 sites
595 of 865 sites
596 of 865 sites
597 of 865 sites
598 of 865 sites
599 of 865 sites
600 of 865 sites
601 of 865 sites
602 of 865 sites
603 of 865 sites
604 of 865 sites
605 of 865 sites
606 of 865 sites
607 of 865 sites
608 of 865 sites
609 of 865 sites
610 of 865 sites
611 of 865 sites
612 of 865 sites
613 of 865 sites
614 of 865 sites
615 of 865 sites
616 of 865 sites
617 of 865 sites
618 of 865 sites
619 of 865 sites
620 of 865 sites
621 of 865 sites
622 of 865 sites
623 of 865 sites
624 of 865 sites
625 of 865 sites
626 of 865 sites
627 of 865 sites
628 of 865 sites
629 of 865 sites
630 of 865 sites
631 of 865 sites
632 of 865 sites
633 of 865 sites
634 of 865 sites
635 of 865 sites
636 of 865 sites
637 of 865 sites
638 of 865 sites
639 of 865 sites
640 of 865 sites
641 of 865 sites
642 of 865 sites
643 of 865 sites
644 of 865 sites
645 of 865 sites
646 of 865 sites
647 of 865 sites
648 of 865 sites
649 of 865 sites
650 of 865 sites
651 of 865 sites
652 of 865 sites
653 of 865 sites
654 of 865 sites
655 of 865 sites
656 of 865 sites
657 of 865 sites
658 of 865 sites
659 of 865 sites
660 of 865 sites
661 of 865 sites
<suds.sax.document.Document object at 0x7f7535b2f130>
Unable to fetch SNOTEL:SNWD_D
662 of 865 sites
663 of 865 sites
664 of 865 sites
665 of 865 sites
666 of 865 sites
667 of 865 sites
668 of 865 sites
669 of 865 sites
670 of 865 sites
671 of 865 sites
672 of 865 sites
673 of 865 sites
674 of 865 sites
675 of 865 sites
676 of 865 sites
677 of 865 sites
678 of 865 sites
679 of 865 sites
680 of 865 sites
681 of 865 sites
682 of 865 sites
683 of 865 sites
684 of 865 sites
685 of 865 sites
<suds.sax.document.Document object at 0x7f75358ef700>
Unable to fetch SNOTEL:SNWD_D
686 of 865 sites
687 of 865 sites
688 of 865 sites
689 of 865 sites
690 of 865 sites
<suds.sax.document.Document object at 0x7f7588815ca0>
Unable to fetch SNOTEL:SNWD_D
691 of 865 sites
692 of 865 sites
693 of 865 sites
694 of 865 sites
695 of 865 sites
696 of 865 sites
697 of 865 sites
698 of 865 sites
699 of 865 sites
700 of 865 sites
701 of 865 sites
702 of 865 sites
703 of 865 sites
704 of 865 sites
705 of 865 sites
706 of 865 sites
707 of 865 sites
708 of 865 sites
709 of 865 sites
710 of 865 sites
711 of 865 sites
712 of 865 sites
713 of 865 sites
714 of 865 sites
715 of 865 sites
716 of 865 sites
717 of 865 sites
718 of 865 sites
719 of 865 sites
720 of 865 sites
721 of 865 sites
722 of 865 sites
723 of 865 sites
724 of 865 sites
<suds.sax.document.Document object at 0x7f7535a97970>
Unable to fetch SNOTEL:SNWD_D
725 of 865 sites
726 of 865 sites
727 of 865 sites
728 of 865 sites
729 of 865 sites
730 of 865 sites
731 of 865 sites
732 of 865 sites
733 of 865 sites
734 of 865 sites
<suds.sax.document.Document object at 0x7f7535a12d00>
Unable to fetch SNOTEL:SNWD_D
735 of 865 sites
736 of 865 sites
737 of 865 sites
738 of 865 sites
<suds.sax.document.Document object at 0x7f7535c4a490>
Unable to fetch SNOTEL:SNWD_D
739 of 865 sites
740 of 865 sites
741 of 865 sites
742 of 865 sites
743 of 865 sites
<suds.sax.document.Document object at 0x7f7535a12880>
Unable to fetch SNOTEL:SNWD_D
744 of 865 sites
745 of 865 sites
746 of 865 sites
747 of 865 sites
748 of 865 sites
749 of 865 sites
750 of 865 sites
751 of 865 sites
752 of 865 sites
753 of 865 sites
754 of 865 sites
755 of 865 sites
756 of 865 sites
757 of 865 sites
758 of 865 sites
759 of 865 sites
760 of 865 sites
761 of 865 sites
762 of 865 sites
763 of 865 sites
764 of 865 sites
765 of 865 sites
766 of 865 sites
<suds.sax.document.Document object at 0x7f7535b9a730>
Unable to fetch SNOTEL:SNWD_D
767 of 865 sites
768 of 865 sites
769 of 865 sites
770 of 865 sites
771 of 865 sites
772 of 865 sites
773 of 865 sites
774 of 865 sites
775 of 865 sites
776 of 865 sites
777 of 865 sites
778 of 865 sites
779 of 865 sites
780 of 865 sites
781 of 865 sites
782 of 865 sites
783 of 865 sites
784 of 865 sites
785 of 865 sites
786 of 865 sites
787 of 865 sites
788 of 865 sites
789 of 865 sites
790 of 865 sites
791 of 865 sites
792 of 865 sites
793 of 865 sites
794 of 865 sites
795 of 865 sites
796 of 865 sites
797 of 865 sites
798 of 865 sites
799 of 865 sites
<suds.sax.document.Document object at 0x7f7535c97340>
Unable to fetch SNOTEL:SNWD_D
800 of 865 sites
801 of 865 sites
802 of 865 sites
803 of 865 sites
804 of 865 sites
805 of 865 sites
806 of 865 sites
807 of 865 sites
808 of 865 sites
809 of 865 sites
810 of 865 sites
811 of 865 sites
<suds.sax.document.Document object at 0x7f7535ca4e80>
Unable to fetch SNOTEL:SNWD_D
812 of 865 sites
813 of 865 sites
814 of 865 sites
815 of 865 sites
816 of 865 sites
817 of 865 sites
818 of 865 sites
819 of 865 sites
820 of 865 sites
821 of 865 sites
822 of 865 sites
<suds.sax.document.Document object at 0x7f753587b7f0>
Unable to fetch SNOTEL:SNWD_D
823 of 865 sites
824 of 865 sites
825 of 865 sites
826 of 865 sites
827 of 865 sites
828 of 865 sites
<suds.sax.document.Document object at 0x7f75889e1790>
Unable to fetch SNOTEL:SNWD_D
829 of 865 sites
830 of 865 sites
831 of 865 sites
832 of 865 sites
833 of 865 sites
834 of 865 sites
835 of 865 sites
836 of 865 sites
837 of 865 sites
<suds.sax.document.Document object at 0x7f7535a304f0>
Unable to fetch SNOTEL:SNWD_D
838 of 865 sites
839 of 865 sites
840 of 865 sites
841 of 865 sites
842 of 865 sites
843 of 865 sites
844 of 865 sites
845 of 865 sites
846 of 865 sites
847 of 865 sites
848 of 865 sites
849 of 865 sites
850 of 865 sites
851 of 865 sites
852 of 865 sites
853 of 865 sites
854 of 865 sites
855 of 865 sites
856 of 865 sites
857 of 865 sites
858 of 865 sites
859 of 865 sites
860 of 865 sites
861 of 865 sites
862 of 865 sites
863 of 865 sites
864 of 865 sites
865 of 865 sites
Writing out: snotel_snwd_d.pkl

Inspect the DataFrame

  • Note structure, number of timestamps, NaNs

  • What is the date of the first snow depth measurement in the network?

    • Note that the water equivalent (WTEQ_D) measurements from snow pillows extend much farther back in time, back to the early 1980s, before the ultrasonic snow depth instruments were incorporated across the network. These are better to use for long-term studies.

snwd_df.shape
(13298, 806)
snwd_df.head()
SNOTEL:301_CA_SNTL SNOTEL:907_UT_SNTL SNOTEL:916_MT_SNTL SNOTEL:908_WA_SNTL SNOTEL:302_OR_SNTL SNOTEL:1000_OR_SNTL SNOTEL:303_CO_SNTL SNOTEL:1030_CO_SNTL SNOTEL:304_OR_SNTL SNOTEL:306_ID_SNTL ... SNOTEL:872_WY_SNTL SNOTEL:873_OR_SNTL SNOTEL:874_CO_SNTL SNOTEL:875_WY_SNTL SNOTEL:876_MT_SNTL SNOTEL:877_AZ_SNTL SNOTEL:1228_UT_SNTL SNOTEL:1197_UT_SNTL SNOTEL:878_WY_SNTL SNOTEL:1033_CO_SNTL
datetime
1984-10-01 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1984-10-02 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1984-10-03 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1984-10-04 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1984-10-05 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 806 columns

snwd_df.describe()
SNOTEL:301_CA_SNTL SNOTEL:907_UT_SNTL SNOTEL:916_MT_SNTL SNOTEL:908_WA_SNTL SNOTEL:302_OR_SNTL SNOTEL:1000_OR_SNTL SNOTEL:303_CO_SNTL SNOTEL:1030_CO_SNTL SNOTEL:304_OR_SNTL SNOTEL:306_ID_SNTL ... SNOTEL:872_WY_SNTL SNOTEL:873_OR_SNTL SNOTEL:874_CO_SNTL SNOTEL:875_WY_SNTL SNOTEL:876_MT_SNTL SNOTEL:877_AZ_SNTL SNOTEL:1228_UT_SNTL SNOTEL:1197_UT_SNTL SNOTEL:878_WY_SNTL SNOTEL:1033_CO_SNTL
count 8237.000000 8173.000000 8924.000000 6103.000000 7407.000000 7458.000000 7385.000000 6402.000000 6468.000000 8795.000000 ... 5614.000000 5725.000000 8284.000000 7128.000000 6797.000000 6185.000000 3127.00000 3143.000000 7244.000000 6090.000000
mean 9.743596 7.439741 24.902174 35.332951 27.337519 35.849558 6.377116 29.565917 15.311070 31.079022 ... 9.041503 14.762969 34.598262 11.130471 10.437840 3.924010 10.54621 8.742603 18.222667 31.388013
std 14.627869 12.829321 23.552421 42.432964 27.273839 40.686039 9.732130 27.805866 19.932041 34.176451 ... 11.332270 19.291954 36.224335 14.395877 12.986167 7.871505 14.16615 12.111207 19.911463 34.733819
min 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000
25% 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 ... 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.00000 0.000000 0.000000 0.000000
50% 0.000000 0.000000 22.000000 15.000000 21.000000 20.000000 1.000000 24.000000 0.000000 17.000000 ... 3.000000 0.000000 26.000000 1.000000 4.000000 0.000000 0.00000 0.000000 10.000000 22.000000
75% 17.000000 12.000000 44.000000 67.000000 49.000000 68.000000 10.000000 54.000000 31.250000 59.000000 ... 17.000000 30.000000 62.000000 22.000000 19.000000 3.000000 22.00000 18.000000 34.000000 55.000000
max 66.000000 70.000000 95.000000 175.000000 107.000000 162.000000 68.000000 106.000000 76.000000 174.000000 ... 139.000000 75.000000 198.000000 64.000000 61.000000 56.000000 60.00000 55.000000 79.000000 169.000000

8 rows × 806 columns

Convert snow depth inches to cm

  • Use these values for the remainder of the lab

Create a histogram of all snow depth values

  • Use the Pandas stack function here, otherwise you will end up with histograms for each station

  • Consider using log scale, as you likely have a spike for days with 0 snow depth

f, ax = plt.subplots()
snwd_df.stack().hist(bins=128, ax=ax, log=True);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_110_0.png

Get the total count of operational stations on each day and plot over time

  • Thought question Can you identify years where a large number of new snow depth sensors were added to the network?

Part 4: Consider temporal correlation of snow depth for nearby stations

#Highest site
#site1 = 'SNOTEL:863_WA_SNTL'
#site2 = 'SNOTEL:692_WA_SNTL'
#Paradise and nearby sites
site1 = 'SNOTEL:679_WA_SNTL'
site2 = 'SNOTEL:1085_WA_SNTL'
site3 = 'SNOTEL:1257_WA_SNTL'
site4 = 'SNOTEL:941_WA_SNTL'
site5 = 'SNOTEL:642_WA_SNTL'

Start with two nearby stations

site_list = [site1,site2]
#Use corresponding colors in line and location scatterplots
color_list = ['C%i' % i for i in range(len(site_list))]
snwd_df[site_list]
SNOTEL:679_WA_SNTL SNOTEL:1085_WA_SNTL
datetime
1984-10-01 00:00:00+00:00 NaN NaN
1984-10-02 00:00:00+00:00 NaN NaN
1984-10-03 00:00:00+00:00 NaN NaN
1984-10-04 00:00:00+00:00 NaN NaN
1984-10-05 00:00:00+00:00 NaN NaN
... ... ...
2021-02-13 00:00:00+00:00 360.68 327.66
2021-02-14 00:00:00+00:00 370.84 342.90
2021-02-15 00:00:00+00:00 383.54 363.22
2021-02-16 00:00:00+00:00 421.64 393.70
2021-02-17 00:00:00+00:00 424.18 388.62

13289 rows × 2 columns

Limit to records where both have valid data (drop NaN)

snwd_df[site_list].dropna(thresh=2)
SNOTEL:679_WA_SNTL SNOTEL:1085_WA_SNTL
datetime
2006-10-05 00:00:00+00:00 0.00 0.00
2006-10-06 00:00:00+00:00 0.00 0.00
2006-10-07 00:00:00+00:00 0.00 0.00
2006-10-08 00:00:00+00:00 0.00 0.00
2006-10-09 00:00:00+00:00 0.00 0.00
... ... ...
2021-02-13 00:00:00+00:00 360.68 327.66
2021-02-14 00:00:00+00:00 370.84 342.90
2021-02-15 00:00:00+00:00 383.54 363.22
2021-02-16 00:00:00+00:00 421.64 393.70
2021-02-17 00:00:00+00:00 424.18 388.62

5089 rows × 2 columns

Plot location and time series

f, axa = plt.subplots(1,2,figsize=(15,5))
sites_gdf_wa.loc[site_list].plot(facecolor=color_list, edgecolor='k', ax=axa[0])
ctx.add_basemap(ax=axa[0], crs=sites_gdf_wa.crs, source=ctx.providers.Stamen.Terrain, alpha=0.7)
snwd_df[site_list].dropna(thresh=2).plot(ax=axa[1])
plt.tight_layout()
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_122_0.png

Consider seasonal variability in snow depth evolution for the two sites

  • Create a scatterplot showing snow depth from site 1 on the y axis and site 2 on the x axis, with color ramp representing DOWY

  • The points should fall on the 1:1 line if the snow depth evolution was identical

#Add column for dowy
add_dowy(snwd_df)
snwd_df[[*site_list,'dowy']].dropna(thresh=2)
SNOTEL:679_WA_SNTL SNOTEL:1085_WA_SNTL dowy
datetime
2006-08-18 00:00:00+00:00 0.00 NaN 322
2006-08-19 00:00:00+00:00 0.00 NaN 323
2006-08-20 00:00:00+00:00 0.00 NaN 324
2006-08-21 00:00:00+00:00 0.00 NaN 325
2006-08-22 00:00:00+00:00 0.00 NaN 326
... ... ... ...
2021-02-13 00:00:00+00:00 360.68 327.66 136
2021-02-14 00:00:00+00:00 370.84 342.90 137
2021-02-15 00:00:00+00:00 383.54 363.22 138
2021-02-16 00:00:00+00:00 421.64 393.70 139
2021-02-17 00:00:00+00:00 424.18 388.62 140

5298 rows × 3 columns

max_snwd = int(np.ceil(snwd_df[[*site_list]].dropna(thresh=2).max().max()))
f,ax = plt.subplots(figsize=(8,8))
ax.set_aspect('equal')
snwd_df[[*site_list,'dowy']].dropna(thresh=2).plot.scatter(x=site1,y=site2,c='dowy',cmap='inferno', s=10,ax=ax);
ax.plot(range(0,max_snwd), range(0,max_snwd), color='lightgreen', ls='--');
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_127_0.png

Looks like 679 has greater snow depth later in the season, compared to 1085

Determine Pearson’s correlation coefficient for the two time series

snwd_corr = snwd_df[site_list].corr()
snwd_corr
SNOTEL:679_WA_SNTL SNOTEL:1085_WA_SNTL
SNOTEL:679_WA_SNTL 1.000000 0.970401
SNOTEL:1085_WA_SNTL 0.970401 1.000000

As expected, these two records are highly correlated!

Now repeat for several nearby sites

site_list = [site1,site2,site3,site4,site5]
#Use corresponding colors in line and location scatterplots
color_list = ['C%i' % i for i in range(len(site_list))]
mygdf = sites_gdf_wa.loc[site_list]
mygdf
code name network elevation_m site_property geometry
SNOTEL:679_WA_SNTL 679_WA_SNTL Paradise SNOTEL 1563.624023 {'county': 'Pierce', 'state': 'Washington', 's... POINT (-570102.829 557709.249)
SNOTEL:1085_WA_SNTL 1085_WA_SNTL Cayuse Pass SNOTEL 1597.151978 {'county': 'Pierce', 'state': 'Washington', 's... POINT (-553060.628 565941.972)
SNOTEL:1257_WA_SNTL 1257_WA_SNTL Skate Creek SNOTEL 1149.095947 {'county': 'Lewis', 'state': 'Washington', 'si... POINT (-577759.243 542830.006)
SNOTEL:941_WA_SNTL 941_WA_SNTL Mowich SNOTEL 963.168030 {'county': 'Pierce', 'state': 'Washington', 's... POINT (-584219.828 575219.336)
SNOTEL:642_WA_SNTL 642_WA_SNTL Morse Lake SNOTEL 1648.968018 {'county': 'Yakima', 'state': 'Washington', 's... POINT (-548802.595 569633.965)
f, axa = plt.subplots(1,2,figsize=(15,5))
mygdf.plot(facecolor=color_list, edgecolor='k', ax=axa[0])
for x, y, label, c in zip(mygdf.geometry.x, mygdf.geometry.y, mygdf.code.str.split('_').str[0], color_list):
    axa[0].annotate(label, xy=(x,y), xytext=(0, -15), ha='center', textcoords="offset points", color=c, bbox=dict(boxstyle="square",fc='w',alpha=0.7))
ctx.add_basemap(ax=axa[0], crs=sites_gdf_wa.crs, source=ctx.providers.Stamen.Terrain, alpha=0.7)
snwd_df[site_list].dropna(thresh=2).plot(ax=axa[1])
plt.tight_layout()
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_135_0.png
snwd_df[site_list].dropna(how='all').plot();
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_136_0.png
#The Pandas `corr` should properly handle nans
snwd_corr = snwd_df[site_list].corr()
snwd_corr
SNOTEL:679_WA_SNTL SNOTEL:1085_WA_SNTL SNOTEL:1257_WA_SNTL SNOTEL:941_WA_SNTL SNOTEL:642_WA_SNTL
SNOTEL:679_WA_SNTL 1.000000 0.970401 0.903934 0.427085 0.954083
SNOTEL:1085_WA_SNTL 0.970401 1.000000 0.939357 0.418298 0.976739
SNOTEL:1257_WA_SNTL 0.903934 0.939357 1.000000 0.500938 0.905656
SNOTEL:941_WA_SNTL 0.427085 0.418298 0.500938 1.000000 0.404402
SNOTEL:642_WA_SNTL 0.954083 0.976739 0.905656 0.404402 1.000000

Visualize correlation values for different combinations of stations

#Not sure why 0.5 values are not identical color?
snwd_corr.style.background_gradient(cmap='inferno').set_precision(2)
SNOTEL:679_WA_SNTL SNOTEL:1085_WA_SNTL SNOTEL:1257_WA_SNTL SNOTEL:941_WA_SNTL SNOTEL:642_WA_SNTL
SNOTEL:679_WA_SNTL 1.00 0.97 0.90 0.43 0.95
SNOTEL:1085_WA_SNTL 0.97 1.00 0.94 0.42 0.98
SNOTEL:1257_WA_SNTL 0.90 0.94 1.00 0.50 0.91
SNOTEL:941_WA_SNTL 0.43 0.42 0.50 1.00 0.40
SNOTEL:642_WA_SNTL 0.95 0.98 0.91 0.40 1.00
import seaborn as sns
sns.heatmap(snwd_corr, cmap='RdBu', vmin=-1, vmax=1, square=True);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_140_0.png

Extra Credit: Consider spatial variability of these correlation coefficients

  • Select a reference station

  • Compute correlation scores with this reference station

    • Join the correlation score values with original GeoDataFrame containing Point geometries

  • Create a scatter plot (map) of correlation score

ref = site1
f, axa = plt.subplots(1,2,figsize=(15,5))
#Join the ref site correlation scores with the original geodataframe and plot
mygdf.join(snwd_corr[ref]).plot(ax=axa[0], column=ref, legend=True, markersize=60, edgecolor='k')
#sites_gdf_wa.loc[site_list].join(snwd_corr[ref]).plot(ax=ax, column=ref, legend=True, markersize=100, edgecolor=color_list, lw=2.0)
for x, y, label, c in zip(mygdf.geometry.x, mygdf.geometry.y, mygdf.code.str.split('_').str[0], color_list):
    axa[0].annotate(label, xy=(x,y), xytext=(0, -15), ha='center', textcoords="offset points", color=c, bbox=dict(boxstyle="square",fc='w',alpha=0.7))
#Plot the reference site
sites_gdf_wa.loc[[ref,ref]].plot(ax=axa[0], marker='+', markersize=200, color='r')
ctx.add_basemap(ax=axa[0], crs=sites_gdf_wa.crs, source=ctx.providers.Stamen.Terrain, alpha=0.7)
axa[0].set_title("Correlation with %s" % ref)
snwd_df[site_list].dropna(thresh=2).plot(ax=axa[1]);
axa[1].set_ylabel('Snow Depth (cm)')
plt.tight_layout()
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_143_0.png

Extra Credit: Consider the correlation as a function of distance from the reference station

  • Compute the distance in km between the reference station and all other stations

  • Create a plot of correlation coefficient vs. distance

Part 5: Extra Credit: Consider correlation for all SNOTEL sites across Western U.S.

  • Consider correlation vs. distance

  • Consider correlation vs. elevation (relative to ref station elevation)

# Scatterplot to consider both distance and elevation
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_150_0.png

Create a map to consider spatial variability in correlation

Part 6: Compute daily snow depth difference for all stations

  • This represents daily snow accumulation or ablation

  • See the Pandas diff function

    • Make sure you specify the axis properly to difference over time (not station to station), and sanity check

    • This should correctly account for missing values, but need to double-check

  • Sanity check with tail - most values should be relatively small (+/-1-6)

snwd_df_diff.head()
SNOTEL:301_CA_SNTL SNOTEL:907_UT_SNTL SNOTEL:916_MT_SNTL SNOTEL:908_WA_SNTL SNOTEL:302_OR_SNTL SNOTEL:1000_OR_SNTL SNOTEL:303_CO_SNTL SNOTEL:1030_CO_SNTL SNOTEL:304_OR_SNTL SNOTEL:306_ID_SNTL ... SNOTEL:874_CO_SNTL SNOTEL:875_WY_SNTL SNOTEL:876_MT_SNTL SNOTEL:877_AZ_SNTL SNOTEL:1228_UT_SNTL SNOTEL:1197_UT_SNTL SNOTEL:878_WY_SNTL SNOTEL:1033_CO_SNTL doy dowy
datetime
1984-10-01 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1984-10-02 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN 1.0 1.0
1984-10-03 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN 1.0 1.0
1984-10-04 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN 1.0 1.0
1984-10-05 00:00:00+00:00 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN 1.0 1.0

5 rows × 808 columns

Create plot showing this daily accumulation for all sites

  • Can start by trying to plot a subset of sites

    • May take ~1-2 min to plot all ~800 sites

  • Adjusting the ylim appropriately

  • Probably best to set legend=False in your plot call

  • Add a black horizontal line at 0 with linewidth=0.5

f,ax = plt.subplots(figsize=(12,5))
snwd_df_diff.plot(ax=ax,legend=False, lw=0.5)
ax.axhline(0, color='k', lw=0.5);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_157_0.png

Interpretation (Provide brief written response these questions)

  • Do you think you can confidently identify large snow accumulation events using these difference values?

  • Are some periods noisier than others?

  • When the snow depth increases from one day to the next, what happened?

  • When the snow depth decreases from one day to the next, what happened?

    • Hint: during the winter, some days never get above freezing, but snow depths still decrease…

Extra Credit: Design filters to remove artifacts and outliers

  • Could likely design a series of filters to remove outliers from original snow depth value time series and the difference time series for each site

  • One idea:

    • What is the maximum amount of snowfall you would expect in a 24-hour period?

    • How about the maximum decrease in snow depth in a 24-hour period?

  • Can also combine multiple stations for another round of filters:

    • If a single station shows an increase of 2 m, but all others show a decrease, is that realistic?

  • Other considerations:

    • mean +/- (3 * std) is often used to remove outliers from normally distributed values (but is this the case for your difference values?)

  • Maybe come back to this if you have time later, for now, just note the presence of measurement errors

f, ax = plt.subplots()
snwd_df_diff.stack().hist(bins=128, ax=ax, log=True);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_160_0.png
snwd_df_diff.count().sum()
4870357
#Mean and std for entire dataset before filtering
print(np.nanmean(snwd_df_diff.values), np.nanstd(snwd_df_diff.values))
0.008573285284836417 14.208349379818223
#Assume maximum daily snowfall of 1.7 m
max_diff = 170
idx = snwd_df_diff > max_diff
#Assume minimum daily decrease of 0.7 m
min_diff = -70
idx = idx | (snwd_df_diff < min_diff)
snwd_df_diff[idx] = np.nan
snwd_df_diff.count().sum()
4863720
f=3
#Mean and std for entire dataset
print(np.nanmean(snwd_df_diff.values), np.nanstd(snwd_df_diff.values))
0.02167078696964463 5.002396854299748
snwd_df_diff.mean().mean()
0.02183840322357014
snwd_df_diff.std().mean()
4.660904228013316
#Mean and std for each site
#idx = (snwd_df_diff - snwd_df_diff.mean()).abs().gt(f*snwd_df_diff.std())
#snwd_df_diff[idx] = np.nan
#Mean and std for each day
#idx = (snwd_df_diff.sub(snwd_df_diff.mean(axis=1), axis=0)).abs().gt(f*snwd_df_diff.std(axis=1))
#snwd_df_diff[idx] = np.nan
snwd_df_diff.median(axis=1).max()
/srv/conda/envs/notebook/lib/python3.8/site-packages/numpy/lib/nanfunctions.py:993: RuntimeWarning: All-NaN slice encountered
  result = np.apply_along_axis(_nanmedian1d, axis, a, overwrite_input)
15.240000000000009
snwd_df_diff.count().sum()
4863720
f,ax = plt.subplots(figsize=(12,5))
snwd_df_diff.plot(ax=ax,legend=False, lw=0.5)
ax.axhline(0, color='k', lw=0.5);
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_172_0.png

For now, let’s aggregate across all stations using robust statistics

  • Create a plot showing the median difference values across all stations for each day in the record (not doy, but all days with data)

    • Again, careful about which axis along which you’re computing the median

  • You may need to adjust the ylim to bring out detail if you haven’t removed outliers

Create a similar plot, but limit to records since Jan 1 of current year

  • Can you identify any big snow events?

  • I added shading to show spread of values (+/-nmad), but this is optional

Now plot just the WA stations (use the wa_stations index we already calculated)

  • Can do this with snwd_df_diff.loc[:,wa_stations]

    • Note: some stations we identified earlier may be missing from the larger time series data frame!

    • Can remove those as shown below

  • What date had the greatest daily snow accumulation across all stations in WA?

print(wa_stations.shape)
wa_stations = wa_stations[wa_stations.isin(snwd_df_diff.columns)]
print(wa_stations.shape)
(84,)
(75,)

Extra Credit: Repeat for WA high elevation sites (defined earlier)

Part 7: Compute statistics for all time series at all SNOTEL sites and evaluate spatial variability

  • So, now we’re going back to our GeoDataFrame containing Point geometry for all sites, and adding some key metrics

col_list = []

Get the count of valid daily records (not NaN) available for each station

  • Which station has the greatest number of observations for SNWD_D?

snwd_df.count(axis=0).sort_values().tail()
SNOTEL:570_NV_SNTL     8982
SNOTEL:651_OR_SNTL     9007
SNOTEL:347_MT_SNTL     9928
doy                   13289
dowy                  13289
dtype: int64

Add the count of valid time series records for each station a new column in our original WA sites GeoDataframe (the one containing lat/lon and other site attributes)

  • Should be straightforward, let Pandas do the work to match site index values

  • Verify your site DataFrame now has a count column

col = 'Total Observation Count (days)'
gdf[col] = snwd_df.count(axis=0)
col_list.append(col)
gdf.head()
code name network elevation_m site_property geometry Total Observation Count (days)
SNOTEL:301_CA_SNTL 301_CA_SNTL Adin Mtn SNOTEL 1886.712036 {'county': 'Modoc', 'state': 'California', 'si... POINT (-120.79192 41.23583) 8237.0
SNOTEL:907_UT_SNTL 907_UT_SNTL Agua Canyon SNOTEL 2712.719971 {'county': 'Kane', 'state': 'Utah', 'site_comm... POINT (-112.27118 37.52217) 8173.0
SNOTEL:916_MT_SNTL 916_MT_SNTL Albro Lake SNOTEL 2529.840088 {'county': 'Madison', 'state': 'Montana', 'sit... POINT (-111.95902 45.59723) 8924.0
SNOTEL:908_WA_SNTL 908_WA_SNTL Alpine Meadows SNOTEL 1066.800049 {'county': 'King', 'state': 'Washington', 'sit... POINT (-121.69847 47.77957) 6103.0
SNOTEL:302_OR_SNTL 302_OR_SNTL Aneroid Lake #2 SNOTEL 2255.520020 {'county': 'Wallowa', 'state': 'Oregon', 'site... POINT (-117.19258 45.21328) 7407.0

Add a column for the long-term max snow depth on record for each site

  • Might need to be careful about measurement errors here - maybe look at values and filter obvious outliers

  • Could also consider a threshold for minimum count of valid days

Add a column for the daily difference (snow accumulation) at all sites for Feb 13, 2021 (or other recent storm)

  • Note, Pandas uses a Timestamp object that wraps the commond Python datetime objects

  • pd.Timestamp(‘YYYY-MM-DD’)

  • Note, you may need to use mydataframe.loc[pd.Timestamp('YYYY-MM-DD')]

Add a column for the snow depth for a recent day (say, yesterday) in the time series

  • Use a relative index value here to get the latest Timestamp (like -1, or maybe -2)

    • Note that using timestamp for today could have limited returns, as latest data from some stations have not been integrated into database

  • Note that the index is not a string, it is a Pandas Timestamp object: pd.Timestamp('2019-02-06 00:00:00')

Compute the long-term median snow depth for this same doy at all sites

  • See earlier example, where we grouped by df.index.dayofyear, then aggregated

daily_med
SNOTEL:301_CA_SNTL SNOTEL:907_UT_SNTL SNOTEL:916_MT_SNTL SNOTEL:908_WA_SNTL SNOTEL:302_OR_SNTL SNOTEL:1000_OR_SNTL SNOTEL:303_CO_SNTL SNOTEL:1030_CO_SNTL SNOTEL:304_OR_SNTL SNOTEL:306_ID_SNTL ... SNOTEL:874_CO_SNTL SNOTEL:875_WY_SNTL SNOTEL:876_MT_SNTL SNOTEL:877_AZ_SNTL SNOTEL:1228_UT_SNTL SNOTEL:1197_UT_SNTL SNOTEL:878_WY_SNTL SNOTEL:1033_CO_SNTL doy dowy
datetime
1 53.34 35.56 83.82 104.14 86.36 137.16 29.21 101.60 83.82 132.08 ... 121.92 49.53 35.56 30.48 58.42 53.34 64.77 99.06 1 93
2 50.80 35.56 83.82 101.60 88.90 137.16 27.94 93.98 82.55 129.54 ... 127.00 52.07 35.56 29.21 58.42 48.26 66.04 101.60 2 94
3 53.34 35.56 81.28 101.60 85.09 137.16 27.94 105.41 78.74 128.27 ... 124.46 49.53 35.56 27.94 55.88 48.26 71.12 101.60 3 95
4 50.80 35.56 83.82 99.06 85.09 139.70 27.94 104.14 82.55 130.81 ... 119.38 55.88 40.64 25.40 58.42 48.26 69.85 106.68 4 96
5 58.42 40.64 83.82 101.60 86.36 154.94 30.48 104.14 81.28 135.89 ... 127.00 54.61 38.10 27.94 55.88 45.72 72.39 107.95 5 97
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
362 45.72 35.56 83.82 96.52 81.28 127.00 25.40 101.60 72.39 119.38 ... 118.11 44.45 33.02 17.78 60.96 50.80 57.15 96.52 362 89
363 45.72 33.02 81.28 93.98 90.17 142.24 26.67 101.60 72.39 129.54 ... 116.84 44.45 33.02 20.32 58.42 55.88 57.15 97.79 363 90
364 48.26 33.02 83.82 96.52 88.90 142.24 25.40 100.33 77.47 132.08 ... 127.00 45.72 35.56 19.05 58.42 50.80 62.23 99.06 364 91
365 50.80 30.48 83.82 99.06 88.90 144.78 27.94 101.60 82.55 132.08 ... 116.84 50.80 38.10 16.51 55.88 45.72 64.77 101.60 365 92
366 59.69 44.45 78.74 165.10 86.36 137.16 29.21 96.52 86.36 111.76 ... 140.97 45.72 33.02 5.08 53.34 50.80 64.77 99.06 366 93

366 rows × 808 columns

Add a column for the percent of “normal” snow depth for most recent day

  • Use the long-term daily median, as we did for a single site earlier

Add a column for total snow accumulation in February of the current year

  • Use a slice object of pandas Timestamps

  • You’ll want to isolate and include the days with positive snow accumulation (excluding days where no new snow fell)

    • Can then compute a sum!

Add a column for the long-term mean snow depth at each site

  • Note: to do this properly, probably want to remove any values near 0 (no snow on the ground) before computing the mean

    • Maybe a threshold of 1 cm?

Sanity check on columns and values

  • You may have slightly different column names and values here, as these stats were computed for an earlier query of original SNOTEL data

gdf.head()
code name network elevation_m site_property geometry Total Observation Count (days) Long-term Max Snow Depth (cm) 2021-02-13 Accumulation (cm) 2021-02-16 00:00:00+00:00 2021-02-16 Percent of Normal Snow Depth Feb 2021 Total Accumulation (cm) Long-term Mean Snow Depth (cm)
SNOTEL:301_CA_SNTL 301_CA_SNTL Adin Mtn SNOTEL 1886.712036 {'county': 'Modoc', 'state': 'California', 'si... POINT (-120.79192 41.23583) 8237.0 167.64 7.62 88.90 109.375000 30.48 53.379241
SNOTEL:907_UT_SNTL 907_UT_SNTL Agua Canyon SNOTEL 2712.719971 {'county': 'Kane', 'state': 'Utah', 'site_comm... POINT (-112.27118 37.52217) 8173.0 177.80 2.54 38.10 68.181818 17.78 47.404758
SNOTEL:916_MT_SNTL 916_MT_SNTL Albro Lake SNOTEL 2529.840088 {'county': 'Madison', 'state': 'Montana', 'sit... POINT (-111.95902 45.59723) 8924.0 241.30 5.08 106.68 97.674419 33.02 91.960994
SNOTEL:908_WA_SNTL 908_WA_SNTL Alpine Meadows SNOTEL 1066.800049 {'county': 'King', 'state': 'Washington', 'sit... POINT (-121.69847 47.77957) 6103.0 444.50 0.00 246.38 131.081081 127.00 150.637508
SNOTEL:302_OR_SNTL 302_OR_SNTL Aneroid Lake #2 SNOTEL 2255.520020 {'county': 'Wallowa', 'state': 'Oregon', 'site... POINT (-117.19258 45.21328) 7407.0 271.78 NaN NaN NaN 0.00 102.905574

Create some scatterplots to review these metrics

  • Hint: remove NaN records on the fly before plotting - see the dropna() method

    • Best to apply after isolating the column you want to plot, consider mydataframe[['max', 'geometry]].dropna().plot(column='max', ...)

  • Can plot separately, or create a figure with multiple subplots, and add titles/labels appropriately

  • I create a function to create the plot, and then looped over each column name I wanted to plot

for col in col_list:
    f, ax = plt.subplots(figsize=(10,10))
    ax.set_title(col)
    gdf[[col, 'geometry']].to_crs('EPSG:3857').dropna().plot(ax=ax, column=col, markersize=30, edgecolor='k', cmap='inferno', legend=True)
    ctx.add_basemap(ax=ax, source=ctx.providers.Stamen.Terrain)
/srv/conda/envs/notebook/lib/python3.8/site-packages/matplotlib/colors.py:1062: RuntimeWarning: invalid value encountered in true_divide
  resdat /= (vmax - vmin)
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_207_1.png ../../_images/08_Vector_TimeSeries_SNOTEL_exercises_207_2.png ../../_images/08_Vector_TimeSeries_SNOTEL_exercises_207_3.png ../../_images/08_Vector_TimeSeries_SNOTEL_exercises_207_4.png ../../_images/08_Vector_TimeSeries_SNOTEL_exercises_207_5.png ../../_images/08_Vector_TimeSeries_SNOTEL_exercises_207_6.png ../../_images/08_Vector_TimeSeries_SNOTEL_exercises_207_7.png

Replot percent of normal (descending sort)

  • Plotting order can be important when you have larger markers that overlap - you can end up with very different interpretations depending on values displayed on top of the plot

  • Let’s explorer this by reviewing the “percent of normal snow depth” plot with ascending and descending order

  • This should be pretty straightforward to sort and plot

    • gdf.to_crs('EPSG:3857')[[col, 'geometry']].dropna().sort_values(by=col, ascending=False).plot()

    • Need to set the column name appropriately

Replot percent of normal (ascending sort)

Interpretation: plotting order can be important, leading to different interpreations! For example, compare values over the Sierra Nevada in CA in the above two plots.

Extra Credit: Can you identify the SNOTEL site with:

  • Greatest long-term maximum snow depth

    • Is this necessarily the true maximum?

    • What happens if the snow gets higher than the instrument?

  • Greatest accumulation in February of current year

    • Where should I go skiing this weekend if I like deep snow?

Extra Credit: Repeat the above plotting for the WA state extent

Extra Credit: Limit your analysis to only include stations with >15 years of data

Part 8: Interpolating sparse values

  • Let’s create continuous gridded values (AKA Raster!) from our sparse point data

  • Note that we wouldn’t do this in practice, as there are much more sophisticated approaches one should use here, but let’s use this dataset to explore some basic interpolation approaches

  • We’ll use the scipy.interpolate.griddata function here, using ‘nearest’ to start, and scipy.interpolate.Rbf

Interpolate February accumulation pattern

import numpy as np
import scipy.interpolate

Specify the column name of variable to interpolate

#Most recent day
col = ts
#February
col= f'Feb {curr_y} Total Accumulation (cm)'

Review the comments/code in the following cells, then run

#Extract the column and geometry, drop NaNs
gdf_dropna = gdf.to_crs(aea_proj)[[col,'geometry']].dropna()
#Pull out (x,y,val)
x = gdf_dropna.geometry.x
y = gdf_dropna.geometry.y
z = gdf_dropna[col]
#Get min and max values
zlim = (z.min(), z.max())
#Compute the spatial extent of the points - we will interpolate across this domain
bounds = gdf_dropna.total_bounds
#Spatial interpolation step of 1 km
dx,dy = (10000,10000)
#Limit to WA state
#bounds = wa_state.total_bounds
#dx,dy = (1000,1000)
#Create 1D arrays of grid cell coordinates in the x and y directions
xi = np.arange(np.floor(bounds[0]), np.ceil(bounds[2]),dx)
yi = np.arange(np.floor(bounds[1]), np.ceil(bounds[3]),dy)
#Function that will plot the interpolated grid, overlaying original values
def plotinterp(zi):
    f, ax = plt.subplots(figsize=(10,10))
    #Define extent of the interpolated grid in projected coordinate system, using matplotlib extent format (left, right, bottom, top)
    mpl_extent = [bounds[0], bounds[2], bounds[1], bounds[3]]
    #Plot the interpolated grid, providing known extent
    #Note: need the [::-1] to flip the grid in the y direction
    ax.imshow(zi[::-1,], cmap='inferno', extent=mpl_extent, clim=zlim)
    #Overlay the original point values with the same color ramp
    gdf_dropna.plot(ax=ax, column=col, cmap='inferno', markersize=30, edgecolor='0.5', vmin=zlim[0], vmax=zlim[1], legend=True)
    #Overlay WA polygon
    #wa_state.plot(ax=ax, facecolor='none', edgecolor='white')
    ax.autoscale(False)
    states_gdf_proj.plot(ax=ax, facecolor='none', edgecolor='white')
    #Make sure aspect is equal
    ax.set_aspect('equal')
#Create 2D grids from the xi and yi grid cell coordinates
xx, yy = np.meshgrid(xi, yi)
#Interpolate values using griddata
zi = scipy.interpolate.griddata((x,y), z, (xx, yy), method='nearest')
plotinterp(zi)
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_229_0.png

Radial basis function interpolation

#Use Radial basis function interpolation
f = scipy.interpolate.Rbf(x,y,z, function='linear')
zi = f(xx, yy)
plotinterp(zi)
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_231_0.png

Explore this a bit

Write out the interpolated snow depth grid using rasterio (raster review)

#Grid origin (center of pixel?)
origin = (xi.min(), yi.min())
print(origin)
#Grid cell size
print(dx, dy)
(-746621.0, -986535.0)
10000 10000
#https://gis.stackexchange.com/questions/279953/numpy-array-to-gtiff-using-rasterio-without-source-raster
import rasterio as rio

transform = rio.transform.from_origin(xi.min(), yi.max(), dx, dy)

out_fn = 'snotel_interp_test.tif'
new_dataset = rio.open(out_fn, 'w', driver='GTiff',
                            height = zi.shape[0], width = zi.shape[1],
                            count=1, dtype=str(zi.dtype),
                            crs=gdf.crs,
                            transform=transform)

new_dataset.write(zi[::-1,], 1)
new_dataset.close()
!ls -l $out_fn
-rw-r--r-- 1 jovyan users 226802 Feb 26 23:09 snotel_interp_test.tif

Load from disk to verify

from rasterio.plot import show
with rio.open(out_fn) as src:
    print(src.profile)
    show(src)
{'driver': 'GTiff', 'dtype': 'float64', 'nodata': None, 'width': 158, 'height': 179, 'count': 1, 'crs': CRS.from_epsg(4326), 'transform': Affine(10000.0, 0.0, -746621.0,
       0.0, -10000.0, 793465.0), 'tiled': False, 'interleave': 'band'}
../../_images/08_Vector_TimeSeries_SNOTEL_exercises_238_1.png

Extra Credit: Clip interpolated grid to buffered radius around all stations

  • Clip the interpolated rasters within 20 km of stations

Extra Credit: Limit interpolation or clip to:

Extra Credit: Consider spatial autocorrelation for the network

Some samples:

Extra Credit: Snow depth vs. elevation analysis for WA

  • Let’s look at snow depth across WA on the most recent day in the record

  • Create a quick scatterplot of elevation vs. snow depth for all sites on this day

  • Do you see a relationship?

Try elevation vs. long-term mean

Extra Credit: Linear regression of snow depth vs. elevation

Extra Credit: Consider other variables that could affect snow depth

  • Distance from the coast (remember the Lab06 exercise calculating distance from WA perimiter?)

  • Mean winter daily temperature (from PRISM)

Extra credit (or, some additional items to explore)

If you have some time (or curiosity), feel free to explore some of these, or define your own questions. This is a really rich dataset, and those of you interested in snow or hydrology may have some cool ideas.

  1. Compute snow depth statistics across all sites grouping by Water Year (or by month/day range where snow is typically present)

  2. Identify date of first major snow accumulation event each year, date of max snow depth, date of snow disappearance - any evolution over time?

  3. Split sites into elevation bands and analyze various metrics

  4. Perform watershed-scale analysis

  5. Explore other interpolation methods for sparse data

  6. Create an animated map of daily accumulation in WA for the past two weeks

  7. Look at other variables for the SNOTEL sites (e.g., snow water equivalent, temperature data)

    • Note that WTEQ_D time series begin much earlier than SNWD_D

  8. Create maps of snow density for WA using WTEQ_D and SNWD_D