Appendix A: Data Processing

In general, links to original data sources are provided in each section below. Because data files are routinely updated, the files as downloaded and used in this document are also available online. These files are provided for reproducibility and transparency; for your own analyses, we encourage you to seek out data from the original linked sources.

A.1 Geographic Setting

A.1.1 Watersheds

Jump to Chapter 1

A.1.1.1 HUCs

Content for this subsection has not yet been written.

A.1.1.2 MDEQ Basins

Shapefiles were provided by the Mississippi Department of Environmental Quality in August 2025. The zipped shapefiles are available in the DEQ_basins folder of the msepCharacterization data files directory. However, we encourage you to contact MDEQ to get the most up-to-date files for your own analyses.

A.1.2 EPA Ecoregions

Jump to Chapter 2

Level 3 and 4 Ecoregion shapefiles were downloaded from the EPA’s website on 16 June 2025 for the states of MS, AL, and LA.

The original downloaded files along with their metadata files are available in the EPA EcoRegions folder of the msepCharacterization data files directory.

Those files were read in, combined, and clipped to the Mississippi Sound Watershed boundaries (using the outline_full object from the {msepBoundaries} package) by the script EPAEcoRegion_subsetting.R (on GitHub) in the R_preprocessing folder. Three objects resulted: level3_lamsal, the Level 3 ecoregions for the three states combined; level3_mssoundwatershed, the Level 3 ecoregions clipped to the Mississippi Sound Watershed; and level4_mssoundwatershed, the Level 4 ecoregions clipped to the Mississippi Sound Watershed. The Level 4 ecoregions for the full three-state area were not retained: they contain too many categories to be useful at that extent and are only practical to examine at the smaller watershed scale.

The resulting objects were saved as layers of a GeoPackage, EPA_EcoRegion.gpkg, in the data/processed folder. This file is available in the processed folder of the msepCharacterization data files directory.
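
Because a GeoPackage is an SQLite database, its layers can be listed without GIS software by querying the gpkg_contents table that the GeoPackage standard defines. The Python sketch below is only an illustration and is not part of the project's R workflow. The helper name list_layers is hypothetical, and the example builds a small stand-in database (with only two of the real table's columns) so that it runs without the actual file.

```python
import sqlite3


def list_layers(conn):
    """Return the layer (table) names registered in a GeoPackage connection."""
    rows = conn.execute(
        "SELECT table_name FROM gpkg_contents ORDER BY table_name"
    ).fetchall()
    return [name for (name,) in rows]


# Stand-in for EPA_EcoRegion.gpkg: an in-memory database holding only a
# simplified contents table (a real GeoPackage has more columns and tables).
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE gpkg_contents (table_name TEXT PRIMARY KEY, data_type TEXT)"
)
demo.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?)",
    [
        ("level3_lamsal", "features"),
        ("level3_mssoundwatershed", "features"),
        ("level4_mssoundwatershed", "features"),
    ],
)

layers = list_layers(demo)
print(layers)
```

To try this against the real file, the in-memory connection would be swapped for a connection to the downloaded EPA_EcoRegion.gpkg.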

A.2 Climatology

A.2.1 Precipitation

Jump to Chapter 3

Precipitation normals for three periods (15-year normals, 2006-2020; 30-year normals, 1991-2020; and a 100-year baseline, 1901-2000) were downloaded as “gridded normals” from the National Centers for Environmental Information Climate Normals website as netCDF (.nc) files. The downloaded files as used are available in the Precipitation_NCEI folder of the msepCharacterization data files directory.

These were subset to only the monthly and annual value layers (removing layers with sd, min, max, and flags) and trimmed geographically to the states of MS, LA, and AL by the script Precipitation_subsetting.R (on GitHub) in the R_preprocessing folder.
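
The layer-selection step can be pictured as a simple name filter. The Python sketch below is illustrative only and is not the project's R script. The layer names and the marker strings are hypothetical; the real netCDF variable names may differ.

```python
# Markers for the layers that were dropped (sd, min, max, and flags).
# The exact strings are assumptions made for this illustration.
DROP_MARKERS = ("_sd", "_min", "_max", "_flag")


def keep_value_layers(layer_names):
    """Keep only layers whose names carry none of the dropped markers."""
    return [
        name for name in layer_names
        if not any(marker in name for marker in DROP_MARKERS)
    ]


# Hypothetical layer names; the real netCDF variable names may differ.
example_layers = [
    "monthly_prcp_norm", "monthly_prcp_sd", "monthly_prcp_min",
    "monthly_prcp_max", "monthly_prcp_flag",
    "annual_prcp_norm", "annual_prcp_sd", "annual_prcp_flag",
]

kept = keep_value_layers(example_layers)
print(kept)
```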

The resulting files were saved as netCDF files prcp_15yearNormals.nc, prcp_30yearNormals.nc, and prcp_100yearBaseline.nc in the data/processed folder. These files are available in the processed folder of the msepCharacterization data files directory.

A.2.2 Temperature

Jump to Chapter 4

Average temperature normals for three periods (15-year normals, 2006-2020; 30-year normals, 1991-2020; and a 100-year baseline, 1901-2000) were downloaded as “gridded normals” from the National Centers for Environmental Information Climate Normals website as netCDF (.nc) files. The downloaded files as used are available in the Temperature_NCEI folder of the msepCharacterization data files directory.

These were subset to only the monthly and annual value layers (removing layers with sd, min, max, and flags) and trimmed geographically to the states of MS, LA, and AL by the script Temperature_subsetting.R (on GitHub) in the R_preprocessing folder.
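
Geographic trimming of a gridded layer amounts to keeping only the cells that fall inside the area of interest. The Python sketch below uses a rectangular bounding box as a simplification; the actual script may trim to state outlines instead. The function names, grid values, and box coordinates are all assumptions made for illustration.

```python
def indices_within(coords, lower, upper):
    """Return indices of grid-cell centers lying within [lower, upper]."""
    return [i for i, value in enumerate(coords) if lower <= value <= upper]


def crop_grid(grid, lats, lons, bbox):
    """Crop a 2-D grid (rows = lats, columns = lons) to a bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat).
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    rows = indices_within(lats, min_lat, max_lat)
    cols = indices_within(lons, min_lon, max_lon)
    cropped = [[grid[r][c] for c in cols] for r in rows]
    return cropped, [lats[r] for r in rows], [lons[c] for c in cols]


# Toy 4 x 5 grid; coordinates and cell values are made up.
lats = [28.0, 30.0, 32.0, 36.0]
lons = [-96.0, -92.0, -90.0, -88.0, -82.0]
grid = [[10 * r + c for c in range(len(lons))] for r in range(len(lats))]

# Rough, illustrative box around MS, LA, and AL (an assumption, not the
# boundary used in Temperature_subsetting.R).
bbox = (-94.5, 28.5, -84.5, 35.5)

cropped, kept_lats, kept_lons = crop_grid(grid, lats, lons, bbox)
print(kept_lats, kept_lons)
print(cropped)
```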

The resulting files were saved as netCDF files tempAvg_15yearNormals.nc, tempAvg_30yearNormals.nc, and tempAvg_100yearBaseline.nc in the data/processed folder. These files are available in the processed folder of the msepCharacterization data files directory.

A.3 Hydrology

A.3.1 Streams and Waterbodies

Jump to Chapter 5

A.3.1.1 Stream Lengths and Waterbody Areas

Stream lengths and waterbody areas were calculated from the EPA’s NHDPlus dataset. Files for the South Atlantic (03f) and Lower Mississippi River (03g) regions were downloaded on 12 June 2025. The original downloaded files are available in the EPA NHDPlus folder of the msepCharacterization data files directory.

Those files were read in, combined, and clipped to the Mississippi Sound Watershed boundaries (using the outline_full object from the {msepBoundaries} package) by the script EPANHD_subsetting.R (on GitHub) in the R_preprocessing folder. The resulting files were saved as the GeoPackages EPA_NHDplus_flowline_MSEP.gpkg and EPA_NHDplus_waterbody_MSEP.gpkg in the data/processed folder. These files are available in the processed folder of the msepCharacterization data files directory.
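
For readers unfamiliar with how lengths and areas are derived from vector geometries, the Python sketch below shows the underlying arithmetic on made-up coordinates. It assumes a projected coordinate system with units of meters and is not the method used in the project's R scripts, which may rely on GIS library functions or on attributes supplied with the dataset. The function names are hypothetical.

```python
import math


def polyline_length(points):
    """Sum straight-line segment lengths along a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(ring):
    """Area of a simple polygon via the shoelace formula.

    ring is a list of (x, y) vertices; the first vertex need not be repeated.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Made-up geometries in a projected coordinate system with units of meters.
flowline = [(0.0, 0.0), (300.0, 400.0), (300.0, 1000.0)]
waterbody = [(0.0, 0.0), (2000.0, 0.0), (2000.0, 500.0), (0.0, 500.0)]

length_km = polyline_length(flowline) / 1000.0
area_km2 = polygon_area(waterbody) / 1_000_000.0
print(length_km, area_km2)
```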

A.3.1.2 Designated Uses

Shapefiles of designated uses, already limited to the portion of the Mississippi Sound Watershed that lies within the state of MS, were provided by the Mississippi Department of Environmental Quality (MDEQ) in April 2025. These files provide the designated uses for each waterbody from the 2015 Water Quality Standards dataset. The Mississippi Sound Watershed is made up of MDEQ’s Pearl River, Pascagoula River, and Coastal Streams basins. Per MDEQ, “all waters in those 3 basins not specifically listed in the data provided are classified as Fish and Wildlife.” See their website for more information.

The zipped shapefiles are available in the DEQ_designated_uses folder of the msepCharacterization data files directory. However, we encourage you to contact MDEQ to get the most up-to-date files for your own analyses.
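
The MDEQ rule that unlisted waters in the three basins default to Fish and Wildlife can be expressed as a lookup with a fallback. The Python sketch below is a minimal illustration; the waterbody names and their listed uses are hypothetical placeholders, and the function name is an assumption.

```python
DEFAULT_USE = "Fish and Wildlife"


def designated_use(waterbody, listed_uses):
    """Look up a waterbody's designated use, falling back to the default.

    The fallback is only valid for waters inside the Pearl River,
    Pascagoula River, and Coastal Streams basins.
    """
    return listed_uses.get(waterbody, DEFAULT_USE)


# Hypothetical entries; real names and uses come from the MDEQ shapefiles.
listed_uses = {
    "Example Creek": "Recreation",
    "Example Bayou": "Shellfish Harvesting",
}

print(designated_use("Example Creek", listed_uses))
print(designated_use("Unlisted Stream", listed_uses))
```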

A.3.1.3 Impairments

This section is not yet complete. In the meantime, see MDEQ’s online TMDL Tool.

A.3.2 Freshwater Inflows

Jump to Chapter 6

Content for this section has not yet been written.

A.3.3 Salinity

Jump to Chapter 7

Seasonal salinity summaries, according to the “Dynamic Five-Zone Salinity Scheme”, were downloaded as shapefiles from the Gulf Data Atlas on 19 March 2025. Only metadata files and files beginning with “MS” were retained (to keep only Mississippi Sound data). The shapefile and related information are in the salinity_gulfDataAtlas subfolder of the processed folder of the msepCharacterization data files directory.

The description of this data layer, also available online, is:
> This is an ArcGIS shapefile which depicts the seasonal salinity dynamics of 32 Gulf of Mexico estuaries. To characterize the dynamic nature of estuarine salinity gradients, a multivariate methodology (Bulger et al. 1993) was applied to derive five bio-salinity zones in four salinity seasons for 32 Gulf of Mexico estuaries (Christensen et al. 1997). This seasonal salinity zone spatial framework built upon and refined earlier studies which characterized salinity on an annual-averaged basis (NOAA 1985, Orlando et al. 1993, NOAA 2007). Precipitation, flow gage data, and monthly salinity averages were evaluated to determine which months would be used to represent the high, low, and transitional (increasing and decreasing) salinity periods. A contour modeling procedure was applied to the data to develop seasonal salinity zones for each estuary. The salinities used to define the five seasonal zones were: 1) Salinity Zone I: 0 - 0.5 ppt; 2) Salinity Zone II: 0.5 - 5 ppt; 3) Salinity Zone III: 5-15 ppt; 4) Salinity Zone IV: 15-25ppt; and 5) Salinity Zone [V]: >25ppt. These salinity zones are two-dimensional and depth-averaged, and vertical stratification is not explicitly characterized. Therefore, they can be readily represented geographically as two-dimensional areas, which shift seasonally. The monthly periods of high, low, increasing and decreasing salinity seasons vary greatly among estuaries, primarily because of different typical periods of high and low freshwater inflow. For example, the low salinity season in Galveston Bay, Texas occurs in April - June, while in Mobile Bay, Alabama, the low salinity season occurs in February - April.
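
The zone thresholds in that description can be turned into a small classifier. The Python sketch below takes the fifth zone (> 25 ppt) to be Zone V. The description does not say which zone a value exactly on a boundary belongs to, so assigning it to the higher zone is an assumption, as is the function name.

```python
from bisect import bisect_right

# Upper bounds (ppt) separating Zones I-V, taken from the layer description.
ZONE_BREAKS = [0.5, 5.0, 15.0, 25.0]
ZONE_NAMES = ["I", "II", "III", "IV", "V"]


def salinity_zone(ppt):
    """Return the salinity zone label for a depth-averaged salinity in ppt."""
    if ppt < 0:
        raise ValueError("salinity cannot be negative")
    # Assumption: a value exactly on a break falls in the higher zone.
    return ZONE_NAMES[bisect_right(ZONE_BREAKS, ppt)]


for value in (0.2, 3.0, 10.0, 20.0, 30.0):
    print(value, salinity_zone(value))
```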

A.4 People and Land Use

A.4.1 Population

Jump to Chapter 8

A.4.1.1 Dasymetric Population Estimates

The EPA EnviroAtlas’s dasymetric population datasets “intelligently reallocate … population from census blocks to 30 meter pixels based on land cover and slope”. The national datasets for 2010 and 2020 were downloaded as raster files on 24 July 2025. Only the 2020 dataset has been used here.
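
As a toy illustration of the dasymetric idea, the Python sketch below splits one census block's population across its pixels in proportion to a per-pixel weight. This is not the EnviroAtlas algorithm: the weights, pixel labels, and function name are hypothetical, and the real method derives its weights from land cover and slope.

```python
def allocate_population(block_population, pixel_weights):
    """Split a block's population across pixels in proportion to weights.

    pixel_weights maps a pixel id to a non-negative habitability weight.
    """
    total_weight = sum(pixel_weights.values())
    if total_weight == 0:
        # No habitable pixels: nothing can be allocated in this toy version.
        return {pixel: 0.0 for pixel in pixel_weights}
    return {
        pixel: block_population * weight / total_weight
        for pixel, weight in pixel_weights.items()
    }


# Hypothetical weights: open water gets 0, developed land more than forest.
weights = {"water": 0.0, "forest": 1.0, "developed_a": 4.5, "developed_b": 4.5}
allocated = allocate_population(100, weights)
print(allocated)
```

The allocated values sum to the block total, so the block's population is preserved while its spatial distribution within the block changes.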

The original downloaded file (zipped; ~3.6 GB) is available in the Population folder of the msepCharacterization data files directory.

The dataset was trimmed to the Mississippi Sound Watershed boundaries (using the outline_full object from the {msepBoundaries} package) by the script Population_2020_subsetting.R (on GitHub) in the R_preprocessing folder.

The resulting file, Population_Dasymetric_2020.tif, is available in the processed folder of the msepCharacterization data files directory.

A.4.2 Land Use / Land Cover

Jump to Chapter 9

Land use/land cover data were downloaded from the USGS Annual National Land Cover Database in April 2025. Each pixel represents a 30 m × 30 m area. Citation: U.S. Geological Survey (USGS), 2024, Annual NLCD Collection 1 Science Products (ver. 1.1, June 2025): U.S. Geological Survey data release, https://doi.org/10.5066/P94UXNTS.

The original downloaded file, Annual_NLCD_LndCov_2023_CU_C1V0.tif (~1.4 GB), covers the entire continental US and is available in the NLCD folder of the msepCharacterization data files directory.

The dataset was trimmed to the Mississippi Sound Watershed boundaries (using the outline_full object from the {msepBoundaries} package) by the script NLCD_subsetting.R (on GitHub) in the R_preprocessing folder. Factor levels for the NLCD categories were also attached to the raster during this process and are stored in the auxiliary file that loads with the dataset in R.

The resulting files, NLCD_MSEP.tif and NLCD_MSEP.tif.aux.xml (the file with associated factor levels), are available in the processed folder of the msepCharacterization data files directory.
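
Because each pixel covers 30 m × 30 m (900 m², or 0.09 ha), class areas follow directly from pixel counts. The Python sketch below tallies a made-up list of pixel codes and labels them with a small subset of the NLCD class names, which is the role the factor levels play in R. The function name and pixel values are hypothetical, and this is not the project's R code.

```python
from collections import Counter

PIXEL_AREA_HA = 30 * 30 / 10_000  # one 30 m x 30 m pixel = 0.09 hectares

# A few NLCD class codes and names (subset only).
NLCD_CLASSES = {
    11: "Open Water",
    24: "Developed, High Intensity",
    42: "Evergreen Forest",
    90: "Woody Wetlands",
}


def class_areas_ha(pixel_codes):
    """Return area in hectares per class name from a flat list of codes."""
    counts = Counter(pixel_codes)
    return {
        NLCD_CLASSES.get(code, f"Unknown ({code})"): n * PIXEL_AREA_HA
        for code, n in counts.items()
    }


# Made-up pixel values standing in for a tiny raster.
pixels = [11, 11, 42, 42, 42, 42, 90, 24, 42, 90]
areas = class_areas_ha(pixels)
print(areas)
```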

About

The .qmd file that generated this section was: data_processing.qmd.