sockeye <- readRDS("data/HansenSockeye.rds")
Many environmental data are time series. Temperature records. Streamflow. Tree rings. CO2 measured at Mauna Loa. All of it indexed to time, all of it carrying a signature of the processes that generated it.
But it goes further than that. Even data sets that aren’t obviously temporal – a soil sample, a species survey, a water chemistry measurement – exist in time. They were collected at a moment, and the system that produced them has a history. Time is the dimension that almost every environmental process moves through, and most of those processes have structure in time: cycles, trends, memory, responses to forcing. Learning to read that structure is part of what it means to do environmental science.
It also matters for getting the statistics right. Temporal dependence – the fact that observations close together in time tend to be more similar than observations far apart – violates the independence assumption that underlies most standard statistical methods. Ignore it, and your standard errors are too small, your p-values too optimistic, your conclusions too confident. Time series analysis gives you the tools to account for that dependence rather than pretend it isn’t there.
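To make that concrete, here is a minimal simulation sketch. It assumes an AR(1) process with a lag-1 autocorrelation of 0.7 and series of length 100; both numbers are chosen purely for illustration and are not taken from any data set in these notes. It compares the standard error that the textbook formula reports with how much the sample mean actually varies from one simulated series to the next.

```r
set.seed(42)
n      <- 100    # length of each simulated series (illustrative)
phi    <- 0.7    # assumed lag-1 autocorrelation (illustrative)
n_sims <- 2000   # number of simulated series

# Simulate many AR(1) series; keep each sample mean and its naive standard error.
sims <- replicate(n_sims, {
  x <- as.numeric(arima.sim(model = list(ar = phi), n = n))
  c(mean = mean(x), naive_se = sd(x) / sqrt(n))
})

# What the independence-based formula reports, averaged over the simulations.
naive_se <- mean(sims["naive_se", ])

# How much the sample mean really varies from series to series.
actual_se <- sd(sims["mean", ])

round(c(naive = naive_se, actual = actual_se, ratio = actual_se / naive_se), 2)
```

For long AR(1) series the ratio approaches sqrt((1 + phi) / (1 - phi)), which is about 2.4 when phi is 0.7. In other words, the naive standard error is less than half of what it should be.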
And beyond the bookkeeping, temporal structure is information. A seasonal cycle tells you something about the drivers of a system. A long-term trend tells you something has changed. Lagged correlations between two series can reveal how one system influences another. The goal of this course is to help you extract that information, work with it honestly, and communicate what it means.
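As a small taste of the lagged-correlation idea, the sketch below builds two made-up series in which a "response" follows a "driver" three time steps later, then uses the cross-correlation function ccf() from base R to recover that delay. The series, the three-step delay, and the noise level are all assumptions invented for the example.

```r
set.seed(1)
n        <- 200
true_lag <- 3   # assumed delay between driver and response (illustrative)

# A made-up "driver" series with some memory of its own.
driver <- as.numeric(arima.sim(model = list(ar = 0.5), n = n))

# The "response" copies the driver from true_lag steps earlier, plus noise.
response <- c(rep(NA, true_lag), driver[1:(n - true_lag)]) + rnorm(n, sd = 0.5)

# Drop the first few time steps, where the response is undefined.
keep <- (true_lag + 1):n
cc <- ccf(driver[keep], response[keep], lag.max = 10, plot = FALSE)

# ccf(x, y) at lag k estimates cor(x[t + k], y[t]), so a peak at a
# negative lag means x (the driver) leads y (the response).
peak_lag <- cc$lag[which.max(abs(cc$acf))]
peak_lag
```

The sign convention is the part that trips people up: the peak lands at a negative lag because the first argument leads the second.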
These are course notes, not a textbook. They are more like a field guide written by someone who has gotten lost in these woods before and wants to help you find your way through.
The notes are hands-on by design. You will write code, look at data, fit models, and interpret results. Each section builds on what came before. The math shows up when it needs to, but the emphasis throughout is on doing time series analysis and understanding what you’re doing – not on deriving things for their own sake.
I’ll point you to textbooks and papers along the way. Read them. They complement what we’re doing here and will matter when you’re working on your own data and I’m not around to ask.
These notes are a living document. They change based on what works, what doesn’t, and what you ask about in class. If something is wrong or unclear, tell me.
Most of the chapters follow a natural progression – each one builds on the last, and working through them in order is the right call. But scattered through the notes you’ll find Asides: short detours that dig into something that doesn’t quite fit the main flow. An Aside might work through the math behind a method, explain how R handles something under the hood, or fill in background that makes the surrounding chapters make more sense.
You can skip the Asides and still follow the core material. But they’re there because the questions they answer are real ones – the kind that come up when you’re working through a chapter and start wondering why something works the way it does. If that’s how your brain works, the Asides are for you.
The notes move through the core ideas of time series analysis in roughly this order:
plot() and summary() depending on the class of the object you give them. Useful background for the whole course.
You are going to learn to see the world in time. That sounds dramatic, but I mean it practically: by the end of this course you will be able to look at a time series, describe its structure, fit a model, check your assumptions, and say something about what the data suggest.
The people who do this work well are not necessarily the ones who find it easiest. They’re the ones who run the code when it breaks, read the error messages, ask questions, and keep going.
So: set up your project, download the data, and let’s get to work.
This document was written in Markdown using Quarto and built with R version 4.6.0. You should be reasonably up to date on your versions of R, RStudio, and relevant packages. You can update your packages by running:
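In base R that is a call to update.packages(). A minimal sketch, assuming the CRAN cloud mirror and that you are happy to skip the per-package prompts (both are illustrative choices, not requirements):

```r
# Assumption: the CRAN cloud mirror; substitute whichever mirror you prefer.
cran <- "https://cloud.r-project.org"

# Only update from the console, never while a document is rendering.
if (interactive()) {
  # ask = FALSE updates every outdated package without prompting one by one.
  update.packages(repos = cran, ask = FALSE)
}
```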
Run that now. And anytime it occurs to you. It’s always a good idea to be up to date with your packages.
To follow along with the examples, you’ll want a working RStudio project.
Create a new RStudio project
Go to File → New Project → New Directory → New Project. Give it a name (for example timeseries-course) and choose where to save it.
Download the data/ folder
The datasets used in the examples are bundled into a single data.zip. Download it and unzip it inside your project directory.
You can download the data directly:
https://timeseries.andybunn.org/data.zip
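If you would rather do this from R than from a browser, something like the following should work. The helper name fetch_course_data() is hypothetical (it is not part of the course materials), and the sketch assumes the archive contains a top-level data/ folder, as the folder layout in these notes suggests.

```r
# Hypothetical helper: download data.zip and unzip it into the project directory.
fetch_course_data <- function(url = "https://timeseries.andybunn.org/data.zip",
                              project_dir = ".") {
  zip_path <- file.path(tempdir(), "data.zip")

  # mode = "wb" keeps the zip file intact on Windows.
  download.file(url, destfile = zip_path, mode = "wb")

  # Assumes the archive unpacks to a data/ folder inside project_dir.
  unzip(zip_path, exdir = project_dir)

  # Return the files now visible under data/ so you can check the result.
  list.files(file.path(project_dir, "data"))
}
```

Call fetch_course_data() once from the console with your RStudio project open, so that "." is the project root.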
Once it’s unzipped your folder structure should look something like this:
timeseries-course/
├── data/
│   ├── HansenSockeye.rds
│   ├── jul65N.rds
│   └── ...
└── timeseries-course.Rproj
In your code, use relative paths like "data/HansenSockeye.rds" rather than full file paths. Because the path is resolved from the project root, the same code runs on other machines without modification. For example: sockeye <- readRDS("data/HansenSockeye.rds")
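A slightly more defensive version of the same idea, sketched under the assumption that your working directory is the project root (which is what opening the .Rproj file gives you):

```r
# Build the path relative to the project root (the folder holding the .Rproj file).
sockeye_path <- file.path("data", "HansenSockeye.rds")

# Read the file only if it is where we expect it; otherwise report where R is looking.
if (file.exists(sockeye_path)) {
  sockeye <- readRDS(sockeye_path)
  str(sockeye)
} else {
  message("Could not find ", sockeye_path, " from ", getwd())
}
```

If you see the "Could not find" message, the usual culprits are a data/ folder unzipped somewhere else or an R session started outside the project.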
When you create Rmd or qmd files for assignments, save them in the project’s root directory. Your folder structure might eventually look something like this:
timeseries-course/
├── data/
│   ├── HansenSockeye.rds
│   ├── jul65N.rds
│   └── ...
├── decompositionHomework.Rmd
├── forecastingHomework.Rmd
└── timeseries-course.Rproj