R for Spatial Analysis & Visualization

Author

Vinit Sehgal, Ph.D.

Published

January 16, 2024

1 Introduction

Statistical computing is essential for scientific inquiry, discovery, and storytelling. As the availability of datasets and access to computing have increased significantly in recent years, the scope of scientific inquiry is restricted only by the imagination of the inquirer.

“Scientific inquiry starts with observation. The more one can see, the more one can investigate.” - Martin Chalfie

However, analysis of large-scale geospatial data (regional to global scale, at high spatial and temporal resolution) can be computationally expensive and time-consuming, especially when working with multiple data formats and sources. R, a high-level programming language, provides a powerful computational alternative to popular Geographic Information System (GIS) software for organizing, analyzing, and visualizing geospatial datasets. R offers a vast collection of open-source libraries for GIS-type operations, along with proven statistical analysis and data visualization capabilities.

In this course, we equip ourselves with hands-on knowledge of accessing, analyzing, and visualizing open-source satellite remote-sensing and geospatial datasets for hydrological, agricultural, and climatological studies within the R environment. The objective of this course is to learn R for:

  • Analyzing geospatial datasets (raster and vector),

  • Performing statistical analysis for each feature/layer, and

  • Mapping and visualizing spatial datasets.

The course will include the latest R tools for working with global Earth-observation datasets from several remote-sensing platforms, such as NASA’s MODIS, SMAP, and Landsat. Basic geospatial operations such as (re)projection, (re)sampling, summary statistics, merge/join, and (re)shape will be covered. Students will be introduced to structured/layered spatial data formats such as NetCDF and HDF, which are used in climate modeling. We will explore several open-source resources for accessing and acquiring hydrometeorological and land datasets for plant, environmental, and soil studies. Special emphasis will be placed on applying out-of-the-box parallel computing techniques with custom user-defined functions for geospatial analysis.

We will start with a refresher on basic R programming in Chapter 1. In Chapters 2 and 3, we will explore spatial data visualization, before moving on to large-scale applications of parallel computing for geospatial analysis.

Purpose of this document:

This resource will serve as dynamic class notes, where students can access detailed concepts, code, and exercises related to this class. The notes will be updated regularly as the class progresses, with up-to-date material and upcoming assignments.