What is remote sensing?
Information obtained through long-range observation (e.g. from satellites, aircraft)
Collection of raw imagery from the surface of the Earth
Image processing
Advantages:
Remote sensing \(=\) raster data
Raster data structure \(\neq\) vector data structure
Raster data structure
Many variables of interest to public policy originate as remote sensing imagery
Example use case
But raster data were not (originally) built for social science and public policy applications
Divergent data structures, approaches
Social science prefers vectors
In social science and public policy, raster data integration requires that we either
Rasterize the vector data
convert discrete features into continuous field
examples:
Vectorize the raster data
summarize values of continuous field at each feature
examples:
Point-to-raster: suppose points are locations of 100 events (e.g. wolf attacks)
Point geometries
A major public policy problem
Option 1: count number of features in each raster pixel/cell
Point counts per cell
Pixel values are local frequency (number of points) or point density (number/area)
Local point frequency
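A minimal sketch of Option 1, assuming a regular square grid anchored at its lower-left corner. The function names, grid parameters, and the four sample points are illustrative and do not come from the slides.

```python
import math


def count_points_per_cell(points, xmin, ymin, cell_size, n_cols, n_rows):
    """Return a grid (list of rows) of point counts; row 0 is the southernmost row."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = math.floor((x - xmin) / cell_size)
        row = math.floor((y - ymin) / cell_size)
        # Points outside the grid extent are ignored.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            grid[row][col] += 1
    return grid


def to_density(counts, cell_size):
    """Convert counts to densities (points per unit area)."""
    area = cell_size ** 2
    return [[c / area for c in row] for row in counts]


# Hypothetical event locations (x, y) on a 4 x 4 extent.
points = [(0.5, 0.5), (0.7, 0.2), (1.5, 0.5), (3.2, 2.9)]
counts = count_points_per_cell(points, xmin=0, ymin=0, cell_size=2, n_cols=2, n_rows=2)
density = to_density(counts, cell_size=2)
print(counts)   # local point frequency
print(density)  # local point density
```

Counts depend on the cell size chosen, while densities divide by cell area and are therefore easier to compare across grids.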
Line-to-raster: suppose this line is an infrastructural object (e.g. road, power line)
Line geometries
Option 2: presence/absence of features at each raster pixel/cell
Line presence/absence per cell
Pixel values are indicators of whether an object is locally present/accessible
Local line access
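One rough way to sketch Option 2 is to sample points densely along the line and flag every cell that receives a sample. This is an approximation, not an exact line-cell intersection test, and it assumes the sampling step is much smaller than the cell size. All names and coordinates are hypothetical.

```python
import math


def line_presence(vertices, xmin, ymin, cell_size, n_cols, n_rows, step=0.01):
    """Return a 0/1 grid marking cells touched by a polyline (row 0 = southernmost)."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        n = max(1, math.ceil(length / step))
        for i in range(n + 1):
            t = i / n
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            col = math.floor((x - xmin) / cell_size)
            row = math.floor((y - ymin) / cell_size)
            if 0 <= col < n_cols and 0 <= row < n_rows:
                grid[row][col] = 1
    return grid


# Hypothetical road running west to east through the southern row of a 3 x 3 grid.
road = [(0.5, 0.5), (2.5, 0.5)]
presence = line_presence(road, xmin=0, ymin=0, cell_size=1, n_cols=3, n_rows=3)
print(presence)
```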
Option 3: distance from feature to each raster pixel/cell
Distance from line to cell
Pixel values represent proximity
Local distance
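Option 3 can be sketched by measuring the planar distance from each cell center to the nearest segment of the line. Measuring from cell centers, using Euclidean distance, and the names used here are assumptions made for illustration.

```python
import math


def point_segment_distance(px, py, x0, y0, x1, y1):
    """Shortest Euclidean distance from point (px, py) to the segment (x0, y0)-(x1, y1)."""
    dx, dy = x1 - x0, y1 - y0
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - x0, py - y0)
    # Project the point onto the segment and clamp to its end points.
    t = ((px - x0) * dx + (py - y0) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))


def distance_raster(vertices, xmin, ymin, cell_size, n_cols, n_rows):
    """Return a grid of distances from each cell center to a polyline (row 0 = southernmost)."""
    grid = []
    for row in range(n_rows):
        cy = ymin + (row + 0.5) * cell_size
        out = []
        for col in range(n_cols):
            cx = xmin + (col + 0.5) * cell_size
            out.append(min(point_segment_distance(cx, cy, *a, *b)
                           for a, b in zip(vertices, vertices[1:])))
        grid.append(out)
    return grid


# Hypothetical power line along y = 0.5 on a 3 x 3 grid of unit cells.
line = [(0.0, 0.5), (3.0, 0.5)]
dist = distance_raster(line, xmin=0, ymin=0, cell_size=1, n_cols=3, n_rows=3)
for r in dist:
    print([round(d, 3) for d in r])
```

With geographic (longitude/latitude) coordinates, planar distance would be misleading, so the data would normally be projected first.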
Polygon-to-raster: suppose polygons are 4 administrative areas (e.g. districts)
Polygon geometries
Option 4: assign pixels to overlapping features (e.g. by center of cell)
Polygon assignment by centroid
Pixel values are polygon labels or attributes (e.g. assumed constant within each polygon)
Local polygon assignment
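A minimal sketch of Option 4: each cell takes the label of the polygon that contains its center, using a standard ray-casting point-in-polygon test. The four square "districts" and all names are hypothetical.

```python
def point_in_polygon(px, py, ring):
    """Ray-casting test: True if (px, py) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > py) != (y1 > py):
            x_cross = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            if px < x_cross:
                inside = not inside
    return inside


def assign_cells(polygons, xmin, ymin, cell_size, n_cols, n_rows):
    """Label each cell with the first polygon containing its center (None if no polygon does)."""
    grid = []
    for row in range(n_rows):
        cy = ymin + (row + 0.5) * cell_size
        out = []
        for col in range(n_cols):
            cx = xmin + (col + 0.5) * cell_size
            label = None
            for name, ring in polygons.items():
                if point_in_polygon(cx, cy, ring):
                    label = name
                    break
            out.append(label)
        grid.append(out)
    return grid


# Four hypothetical districts tiling a 4 x 4 extent.
districts = {
    "SW": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "SE": [(2, 0), (4, 0), (4, 2), (2, 2)],
    "NW": [(0, 2), (2, 2), (2, 4), (0, 4)],
    "NE": [(2, 2), (4, 2), (4, 4), (2, 4)],
}
labels = assign_cells(districts, xmin=0, ymin=0, cell_size=1, n_cols=4, n_rows=4)
for r in labels:
    print(r)
```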
Rasterization overview
These operations can be done on all types of vector data
But the question remains: why do this?
Raster-to-polygon: suppose raster represents a continuous variable (e.g. elevation)
Raster cell values
Option 1: calculate zonal statistics (e.g. average cell values) for each polygon
Zonal statistics: mean cell values
Average cell values are added to attribute table for polygons
Mean cell values for each polygon
Same operation could be used to obtain maximum cell values…
Maximum cell values for each polygon
…or minimum values (or any other summary statistic)
Minimum cell values for each polygon
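Zonal statistics can be sketched as a group-by over cells. The sketch below assumes the polygons have already been converted to a zone raster aligned with the value raster (for example by cell-center assignment); the rasters, zone labels, and function name are made up for illustration.

```python
from collections import defaultdict


def zonal_statistics(values, zones):
    """Summarize cell values per zone; cells with no zone or no value are skipped."""
    cells = defaultdict(list)
    for value_row, zone_row in zip(values, zones):
        for v, z in zip(value_row, zone_row):
            if z is not None and v is not None:
                cells[z].append(v)
    return {
        z: {"mean": sum(vs) / len(vs), "min": min(vs), "max": max(vs), "n": len(vs)}
        for z, vs in cells.items()
    }


# Hypothetical elevation raster and an aligned zone raster with two polygons.
elevation = [[10, 20, 30, 40],
             [10, 20, 50, 40]]
zones = [["A", "A", "B", "B"],
         ["A", "A", "B", None]]

stats = zonal_statistics(elevation, zones)
print(stats["A"])
print(stats["B"])
```

Each entry of `stats` corresponds to one row that would be joined to the polygon attribute table.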
But what if raster represents a categorical variable (e.g. land use)?
Raster cell values
Option 2: reclassify raster to binary (e.g. 1 if land use “A”, 0 otherwise)
Reclassified raster
Calculate zonal statistics: percent of each polygon with cell values of “A”
Zonal statistics: value “A” as percent of overlapping cells
Percentages are added to attribute table for polygons
Percent “A” per polygon
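The categorical case can be sketched in two steps: reclassify to a 0/1 raster, then take the zonal mean of that indicator and express it as a percentage. As before, an aligned zone raster is assumed, and the land-use classes, zone labels, and function names are illustrative.

```python
from collections import defaultdict


def reclassify_binary(raster, target):
    """Return 1 where the cell equals `target`, 0 otherwise."""
    return [[1 if v == target else 0 for v in row] for row in raster]


def percent_per_zone(binary, zones):
    """Percent of each zone's cells that have value 1."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for b_row, z_row in zip(binary, zones):
        for b, z in zip(b_row, z_row):
            if z is not None:
                hits[z] += b
                totals[z] += 1
    return {z: 100 * hits[z] / totals[z] for z in totals}


# Hypothetical land-use raster and an aligned zone raster with two polygons.
land_use = [["A", "A", "B", "C"],
            ["A", "B", "B", "A"]]
zones = [["P1", "P1", "P2", "P2"],
         ["P1", "P1", "P2", "P2"]]

is_a = reclassify_binary(land_use, "A")
pct_a = percent_per_zone(is_a, zones)
print(pct_a)
```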
Scale-Pattern-Process
Scale of analysis (spatial, temporal) impacts which patterns are observable
Processes drive patterns whose observation is scale-dependent
some research questions require high spatial resolution:
some research questions require high temporal resolution:
some questions can be answered at low resolution (e.g. long-term, large-scale)
Trade-offs
How scale impacts rasterization and vectorization
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Point-to-raster: same underlying point pattern, two very different rasters
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Counts and densities will appear sparser (more intense) in high-(low-)resolution rasters
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
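To make the resolution effect on point counts concrete, the sketch below rasterizes the same four hypothetical points onto a fine and a coarse grid over the same 4 x 4 extent and compares the share of empty cells. The points, cell sizes, and function names are illustrative assumptions.

```python
import math


def count_points(points, cell_size, extent):
    """Count points per cell on a square grid covering [0, extent) in x and y."""
    n = round(extent / cell_size)
    grid = [[0] * n for _ in range(n)]
    for x, y in points:
        col = math.floor(x / cell_size)
        row = math.floor(y / cell_size)
        if 0 <= col < n and 0 <= row < n:
            grid[row][col] += 1
    return grid


def share_empty(grid):
    """Fraction of cells that contain no points."""
    cells = [c for row in grid for c in row]
    return sum(c == 0 for c in cells) / len(cells)


points = [(0.5, 0.5), (0.6, 0.4), (3.5, 3.5), (1.5, 2.5)]

high = count_points(points, cell_size=1, extent=4)  # 16 small cells
low = count_points(points, cell_size=2, extent=4)   # 4 large cells

print(share_empty(high))  # most small cells are empty
print(share_empty(low))   # most large cells contain at least one point
print(low)
```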
Line-to-raster: same line features, two very different rasters
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Absence/presence measures are more (less) variable in high-(low-)resolution rasters
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
High-(low-)resolution rasters more (less) precisely reflect shape of vector geometries
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Distance measures also have more (less) variation in high-(low-)resolution rasters
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Polygon-to-raster: same polygon features, two very different rasters
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Assignment operations are more (less) coarse in low-(high-)resolution rasters
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Some polygon features may disappear entirely in low-resolution rasters
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
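A small sketch of how a polygon can vanish under center-based assignment: a feature receives cells only if it contains at least one cell center, so a polygon smaller than a cell can easily receive none. The polygon, extent, and cell sizes below are hypothetical.

```python
def point_in_polygon(px, py, ring):
    """Ray-casting test: True if (px, py) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > py) != (y1 > py):
            x_cross = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
            if px < x_cross:
                inside = not inside
    return inside


def cells_assigned(ring, cell_size, extent):
    """Number of cells on a grid over [0, extent) whose centers fall inside the polygon."""
    n = round(extent / cell_size)
    count = 0
    for row in range(n):
        for col in range(n):
            cx = (col + 0.5) * cell_size
            cy = (row + 0.5) * cell_size
            if point_in_polygon(cx, cy, ring):
                count += 1
    return count


# A small hypothetical district inside a 4 x 4 extent.
small_district = [(0.2, 0.2), (0.8, 0.2), (0.8, 0.8), (0.2, 0.8)]

n_high = cells_assigned(small_district, cell_size=0.5, extent=4)
n_low = cells_assigned(small_district, cell_size=2, extent=4)
print(n_high, n_low)  # the district receives no cells on the coarse grid
```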
Raster-to-polygon: suppose we have two rasters with same underlying data
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Zonal statistics on the high-resolution raster will be more precise
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Low-resolution raster is more likely to generate missing values in polygon features
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
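The loss of precision and the appearance of missing values can be illustrated with a deliberately simplified one-dimensional transect. The sketch assumes a hypothetical elevation surface that rises linearly with x (so the true mean over the zone from 0 to 1.5 is 0.75), and a zone that only receives cells whose centers fall inside it. None of these numbers come from the slides.

```python
def zonal_mean_1d(cell_size, extent, zone_min, zone_max, field):
    """Mean of cell values for cells whose centers lie in [zone_min, zone_max).

    Returns None when no cell center falls inside the zone (a missing value).
    """
    n = round(extent / cell_size)
    values = []
    for i in range(n):
        center = (i + 0.5) * cell_size
        if zone_min <= center < zone_max:
            # For a linear field, the value at the center equals the cell average.
            values.append(field(center))
    return sum(values) / len(values) if values else None


def elevation(x):
    """Hypothetical linear elevation surface."""
    return x


high = zonal_mean_1d(0.5, 4, 0.0, 1.5, elevation)     # three cells inside the zone
low = zonal_mean_1d(2, 4, 0.0, 1.5, elevation)        # one cell inside the zone
coarsest = zonal_mean_1d(4, 4, 0.0, 1.5, elevation)   # no cell center inside the zone

print(high, low, coarsest)
```

In this toy case the fine grid recovers the true zonal mean, the coarse grid overstates it, and the coarsest grid yields no value at all for the zone.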
Similar problems arise with zonal statistics on rasters with categorical variables
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Lower resolution \(\to\) fewer raster cells to calculate statistics over, less precision
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Low-resolution rasters may sometimes also exaggerate the amount of variation
Figure panels: high resolution (small pixels) vs. low resolution (large pixels)
Why not always use highest-resolution raster data?
Which scale is right for me?