Remote sensing and public policy

Definitions


What is remote sensing?

 

Information obtained through long-range observation (e.g. from satellites, aircraft)

  1. Collection of raw imagery of the Earth's surface

    1. passive sensors: collect information on naturally emitted or reflected light/radiation (e.g. sunlight)
      • examples: photography, infrared
    2. active sensors: emit energy, collect information on reflected light/radiation
      • examples: radar, LiDAR
  2. Image processing

    1. raw images are georeferenced to ground control points
    2. emitted/reflected light matched to specific spectral signatures
      • examples: types of vegetation, land cover, CO\(_2\) emissions
    3. processed data are stored as pixels in raster datasets

 

Advantages:

  • remote sensing is sometimes cheaper and safer than direct observation
    (e.g. hard-to-reach areas, conflict zones)
  • measurement is consistent across regional, national borders

Remote sensing \(=\) raster data

  • not all raster data are derived from remote sensing imagery
  • but all remote sensing imagery originates as raster data

Raster data structure \(\neq\) vector data structure

  • vectors store information in “attribute tables” (\(N\) features \(\times\) \(K\) fields)
  • rasters store information in a grid of pixels (\(N_R\) rows \(\times\) \(N_C\) columns)
    • pixels are of constant size, shape, area
    • each pixel represents a unique location
    • each pixel contains just one value
      (e.g. precipitation, land use)
    • size of pixels determines resolution
      (e.g. 1 meter, 1 km, 1 degree)
  • rasters usually have larger file size than vectors, but not necessarily more precision


 

 

Raster data structure
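
The contrast between the two data structures can be sketched in a few lines of plain Python. This is only an illustration: the grid values, the upper-left origin (`x_min`, `y_max`), the `cell_size`, and the two features are all made-up assumptions, and a real workflow would normally rely on a GIS library rather than nested lists.

```python
# Hypothetical 3 x 4 raster: exactly one value per pixel, stored row by row.
raster = [
    [5.0, 6.0, 7.0, 8.0],
    [4.0, 5.0, 6.0, 7.0],
    [3.0, 4.0, 5.0, 6.0],
]
n_rows, n_cols = len(raster), len(raster[0])

# Assumed georeferencing: upper-left corner and square pixel size (map units).
x_min, y_max, cell_size = 100.0, 200.0, 10.0


def cell_center(row, col):
    """Return the (x, y) coordinate of the center of pixel (row, col)."""
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size  # rows count downward from the top
    return x, y


def cell_index(x, y):
    """Return the (row, col) of the pixel that contains coordinate (x, y)."""
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


# Vector counterpart: an attribute table with N features x K fields.
features = [
    {"id": 1, "name": "A", "x": 112.0, "y": 193.0},
    {"id": 2, "name": "B", "x": 105.0, "y": 171.0},
]

# Look up the raster value under each point feature and store it as a new field.
for feature in features:
    row, col = cell_index(feature["x"], feature["y"])
    feature["value"] = raster[row][col]
```

Each pixel's location is implied by its row and column plus the georeferencing, while each vector feature carries its own coordinates and fields.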

Applications


Many variables of interest to public policy originate as remote sensing imagery

  • weather (precipitation, temperature)
  • climate model forecasts
  • flooding depth and risk
  • active fires
  • night light emissions
  • elevation, slope, line of sight
  • pollution and air quality
  • cloud cover
  • vegetation indices
  • soil quality, fertility
  • land use and land cover (LULC)
  • built-up areas
  • population density (derived from above)


 

 

Example use case


But raster data were not (originally) built for social science and public policy applications

  • original policy purpose: military reconnaissance, damage assessment
  • original scientific purpose: natural sciences (e.g. geology, ecology, biology)
  • no sensor systems or spectral measurements were designed specifically to monitor social or economic processes
  • reliance on indirect/proxy measures

Divergent data structures, approaches

  • social science: “vector view” of world (e.g. organize data into discrete units)
  • natural science: “raster view” of world (e.g. organize data into regular lattice)
  • integrating raster and vector data requires interdisciplinary cooperation


 

 

 

Social science prefers vectors

Raster data analysis

Rasterization and Vectorization


In social science and public policy, raster data integration requires that we either

 

  1. Rasterize the vector data

    • convert discrete features into continuous field

    • examples:

      1. frequency/density of features
      2. presence/absence of feature
      3. distance to features
      4. assignment to feature
  2. Vectorize the raster data

    • summarize values of continuous field at each feature

    • examples:

      1. zonal statistics (e.g. mean, max cell values)
      2. image tracing (e.g. of georeferenced maps – we covered this earlier)

Point-to-raster: suppose points are locations of 100 events (e.g. wolf attacks)

Point geometries


Option 1: count number of features in each raster pixel/cell

Point counts per cell


Pixel values are local frequency (number of points) or point density (number/area)

Local point frequency
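
A minimal sketch of point-to-raster counting, assuming a 10 × 10 study area, 2 × 2 cells, and randomly generated event locations (none of these values come from the slides; `count_points` is a hypothetical helper):

```python
import random

random.seed(1)
# Hypothetical event locations: 100 points scattered over a 10 x 10 study area.
points = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(100)]


def count_points(points, x_min, y_max, cell_size, n_rows, n_cols):
    """Count how many points fall inside each raster cell."""
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y_max - y) // cell_size)  # row 0 is the top of the grid
        if 0 <= row < n_rows and 0 <= col < n_cols:
            counts[row][col] += 1
    return counts


cell_size = 2.0
counts = count_points(points, 0.0, 10.0, cell_size, 5, 5)

# Point density: count divided by cell area.
density = [[c / (cell_size * cell_size) for c in row] for row in counts]
```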


Line-to-raster: suppose this line is an infrastructural object (e.g. road, power line)

Line geometries


Option 2: presence/absence of features at each raster pixel/cell

Line presence/absence per cell


Pixel values are indicators of whether an object is locally present/accessible

Local line access
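
One simple way to approximate a presence/absence raster is to sample points densely along the line and mark every cell that a sample lands in. The sketch below assumes a hypothetical road on a 4 × 4 grid; the sampling `step` is an illustrative choice, and dedicated GIS rasterizers use exact line-traversal rules instead.

```python
def rasterize_line(vertices, x_min, y_max, cell_size, n_rows, n_cols, step=0.01):
    """Mark each cell touched by points sampled along the line (approximate)."""
    grid = [[0] * n_cols for _ in range(n_rows)]
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        length = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        n_steps = max(1, int(length / step))
        for i in range(n_steps + 1):
            t = i / n_steps
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            col = int((x - x_min) // cell_size)
            row = int((y_max - y) // cell_size)
            if 0 <= row < n_rows and 0 <= col < n_cols:
                grid[row][col] = 1  # 1 = feature present in this cell
    return grid


# Hypothetical road: runs east along the bottom of the grid, then turns north.
road = [(0.5, 0.5), (3.5, 0.5), (3.5, 3.5)]
access = rasterize_line(road, 0.0, 4.0, 1.0, 4, 4)

for row in access:
    print(row)
```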


Option 3: distance from feature to each raster pixel/cell

Distance from line to cell


Pixel values represent proximity

Local distance
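
A distance raster can be sketched by measuring the straight-line distance from each cell center to the nearest segment of the line. The line, grid extent, and cell size below are illustrative assumptions, and distances are planar (no map projection is considered).

```python
import math


def point_segment_distance(px, py, x0, y0, x1, y1):
    """Shortest planar distance from point (px, py) to the segment (x0,y0)-(x1,y1)."""
    dx, dy = x1 - x0, y1 - y0
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: both endpoints coincide
        return math.hypot(px - x0, py - y0)
    # Project the point onto the segment and clamp to its endpoints.
    t = ((px - x0) * dx + (py - y0) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (x0 + t * dx), py - (y0 + t * dy))


def distance_raster(vertices, x_min, y_max, cell_size, n_rows, n_cols):
    """Distance from every cell center to the nearest part of the line."""
    grid = []
    for row in range(n_rows):
        cy = y_max - (row + 0.5) * cell_size
        values = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size
            values.append(min(
                point_segment_distance(cx, cy, *a, *b)
                for a, b in zip(vertices, vertices[1:])
            ))
        grid.append(values)
    return grid


# Hypothetical power line running east-west through the bottom row of cells.
power_line = [(0.0, 0.5), (4.0, 0.5)]
distance = distance_raster(power_line, 0.0, 4.0, 1.0, 4, 4)
```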


Polygon-to-raster: suppose polygons are 4 administrative areas (e.g. districts)

Polygon geometries


Option 4: assign pixels to overlapping features (e.g. by center of cell)

Polygon assignment by centroid


Pixel values are polygon labels or attributes (e.g. assumed constant)

Local polygon assignment
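
Assignment by cell center can be sketched with a standard ray-casting point-in-polygon test. The four square districts below are hypothetical, and the tie-breaking rule (the first matching polygon wins) is an assumption made for illustration.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: is (x, y) inside the polygon given by its vertex ring?"""
    inside = False
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # edge crosses the horizontal line through y
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def assign_polygons(polygons, x_min, y_max, cell_size, n_rows, n_cols):
    """Label each cell with the first polygon that contains its center."""
    grid = []
    for row in range(n_rows):
        cy = y_max - (row + 0.5) * cell_size
        labels = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size
            label = None  # stays None if no polygon covers the cell center
            for name, ring in polygons:
                if point_in_polygon(cx, cy, ring):
                    label = name
                    break
            labels.append(label)
        grid.append(labels)
    return grid


# Four hypothetical districts, each covering one quadrant of a 4 x 4 area.
districts = [
    ("A", [(0, 2), (2, 2), (2, 4), (0, 4)]),
    ("B", [(2, 2), (4, 2), (4, 4), (2, 4)]),
    ("C", [(0, 0), (2, 0), (2, 2), (0, 2)]),
    ("D", [(2, 0), (4, 0), (4, 2), (2, 2)]),
]
labels = assign_polygons(districts, 0.0, 4.0, 1.0, 4, 4)
```

The same grid could hold any polygon attribute instead of the label, under the assumption that the attribute is constant within each polygon.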


Rasterization overview

 

These operations can be done on all types of vector data

  1. count/density of points/lines/polygons
  2. presence/absence of points/lines/polygons
  3. distance to points/lines/polygons
  4. assignment to points/lines/polygons (with tie-breaking rule)

 

But why do this?

  • pixels are not meaningful spatial units for public policy
  • policymakers don’t think of the world as a “continuous field”
  • policy is made in discrete geographic jurisdictions, with well-defined borders
  • more common approach to analysis: convert raster to vector

Raster-to-polygon: suppose raster represents a continuous variable (e.g. elevation)

Raster cell values


Option 1: calculate zonal statistics (e.g. average cell values) for each polygon

Zonal statistics: mean cell values


Average cell values are added to attribute table for polygons

Mean cell values for each polygon


Same operation could be used to obtain maximum cell values…

Maximum cell values for each polygon


…or minimum values (or any other summary statistic)

Minimum cell values for each polygon
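
Zonal statistics can be sketched as grouping cell values by the polygon each cell belongs to and then summarizing each group. The elevation values and zone labels below are invented; the zone raster is assumed to come from a prior polygon-to-raster assignment on the same grid.

```python
from statistics import mean

# Hypothetical continuous raster (e.g. elevation), one value per cell.
elevation = [
    [10, 12, 30, 32],
    [11, 13, 31, 33],
    [50, 52, 70, 72],
    [51, 53, 71, 73],
]

# Zone raster on the same grid: the polygon each cell is assigned to.
zones = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["C", "C", "D", "D"],
    ["C", "C", "D", "D"],
]


def zonal_stats(values, zones):
    """Summarize cell values within each zone (polygon)."""
    cells = {}
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if zone is not None and value is not None:  # skip unassigned/missing
                cells.setdefault(zone, []).append(value)
    return {
        zone: {"mean": mean(vals), "min": min(vals), "max": max(vals), "n": len(vals)}
        for zone, vals in cells.items()
    }


stats = zonal_stats(elevation, zones)
# Each entry would become a new column in the polygon attribute table.
for zone, summary in sorted(stats.items()):
    print(zone, summary)
```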


But what if raster represents a categorical variable (e.g. land use)?

Raster cell values


Option 2: reclassify raster to binary (e.g. 1 if land use “A”, 0 otherwise)

Reclassified raster


Calculate zonal statistics: percent of each polygon with cell values of “A”

Zonal statistics: value “A” as percent of overlapping cells


Percentages are added to attribute table for polygons

Percent “A” per polygon
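
For a categorical raster, the same idea can be sketched in two steps: reclassify to a 0/1 indicator, then take the share of 1s within each polygon. The land-use codes and zone identifiers below are made up for illustration.

```python
# Hypothetical categorical raster (land-use classes "A", "B", "C").
land_use = [
    ["A", "A", "B", "C"],
    ["A", "B", "B", "C"],
    ["C", "C", "A", "A"],
    ["C", "B", "A", "A"],
]

# Zone raster on the same grid: polygon identifier for each cell.
zones = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
]

# Step 1: reclassify to binary (1 if land use is "A", 0 otherwise).
is_a = [[1 if value == "A" else 0 for value in row] for row in land_use]

# Step 2: percent of cells in each polygon where the indicator equals 1.
totals, hits = {}, {}
for flag_row, zone_row in zip(is_a, zones):
    for flag, zone in zip(flag_row, zone_row):
        totals[zone] = totals.get(zone, 0) + 1
        hits[zone] = hits.get(zone, 0) + flag

percent_a = {zone: 100.0 * hits[zone] / totals[zone] for zone in totals}
print(percent_a)
```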

Scale-dependence


Scale-Pattern-Process

 

  1. Scale of analysis (spatial, temporal) impacts which patterns are observable

    • these observations shape inferences we draw about underlying social processes
  2. Processes drive patterns whose observation is scale-dependent

    • some research questions require high spatial resolution:

      1. urban/neighborhood policy
      2. bomb damage assessment
    • some research questions require high temporal resolution:

      1. emergency response
      2. weather forecasting
    • some questions can be answered at low resolution (e.g. long-term, large-scale)

      1. economic development
      2. deforestation, changes in land use

Trade-offs

  • lower resolution (large pixels) \(=\) more information loss
  • higher resolution (small pixels) \(=\) higher collection, storage, computation costs

How scale impacts rasterization and vectorization

High resolution (small pixels)


Low resolution (large pixels)


Point-to-raster: same underlying point pattern, two very different rasters

High resolution (small pixels)


Low resolution (large pixels)


Counts, densities will appear sparser (more intense) in high-(low-)resolution rasters

High resolution (small pixels)


Low resolution (large pixels)
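
The effect of cell size on point counts can be sketched by counting the same points on two grids. The eight points and the two cell sizes (1 and 2 map units) are illustrative assumptions.

```python
def count_points(points, x_min, y_max, cell_size, n_rows, n_cols):
    """Count how many points fall inside each raster cell."""
    counts = [[0] * n_cols for _ in range(n_rows)]
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y_max - y) // cell_size)
        if 0 <= row < n_rows and 0 <= col < n_cols:
            counts[row][col] += 1
    return counts


# Hypothetical event locations inside a 4 x 4 study area.
points = [
    (0.2, 3.8), (0.7, 3.1), (1.5, 3.5), (1.2, 2.4),
    (3.6, 0.3), (3.1, 0.9), (2.5, 1.5), (0.4, 0.6),
]

high_res = count_points(points, 0.0, 4.0, 1.0, 4, 4)  # 16 small cells
low_res = count_points(points, 0.0, 4.0, 2.0, 2, 2)   # 4 large cells


def summarize(counts):
    """Return the largest cell count and the number of empty cells."""
    flat = [c for row in counts for c in row]
    return {"max": max(flat), "empty": flat.count(0), "cells": len(flat)}


print("high resolution:", summarize(high_res))
print("low resolution: ", summarize(low_res))
```

In this toy case the high-resolution grid has many empty cells and small counts, while the low-resolution grid concentrates the same events into a few large counts.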


Line-to-raster: same line features, two very different rasters

High resolution (small pixels)


Low resolution (large pixels)


Presence/absence measures are more (less) variable in high-(low-)resolution rasters

High resolution (small pixels)


Low resolution (large pixels)


High-(low-)resolution rasters more (less) precisely reflect shape of vector geometries

High resolution (small pixels)


Low resolution (large pixels)


Distance measures also have more (less) variation in high-(low-)resolution rasters

High resolution (small pixels)


Low resolution (large pixels)


Polygon-to-raster: same polygon features, two very different rasters

High resolution (small pixels)


Low resolution (large pixels)


Assignment operations are more (less) coarse in low-(high-)resolution rasters

High resolution (small pixels)


Low resolution (large pixels)


Some polygon features may disappear entirely in low-resolution rasters

High resolution (small pixels)


Low resolution (large pixels)
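
A small sketch of how a feature can vanish under center-of-cell assignment. For brevity the polygons are axis-aligned rectangles (a simplifying assumption), the district names are hypothetical, and the first matching rectangle wins ties.

```python
def assign_rectangles(rects, x_min, y_max, cell_size, n_rows, n_cols):
    """Label each cell with the first rectangle containing its center."""
    grid = []
    for row in range(n_rows):
        cy = y_max - (row + 0.5) * cell_size
        labels = []
        for col in range(n_cols):
            cx = x_min + (col + 0.5) * cell_size
            label = None
            for name, (xa, ya, xb, yb) in rects:
                if xa <= cx < xb and ya <= cy < yb:
                    label = name
                    break
            labels.append(label)
        grid.append(labels)
    return grid


# Hypothetical districts as (x_min, y_min, x_max, y_max); the small enclave
# is listed first so that it takes priority over the district around it.
districts = [
    ("Enclave", (2.2, 1.2, 2.8, 1.8)),
    ("West", (0.0, 0.0, 2.0, 4.0)),
    ("East", (2.0, 0.0, 4.0, 4.0)),
]

high_res = assign_rectangles(districts, 0.0, 4.0, 0.5, 8, 8)  # 0.5-unit cells
low_res = assign_rectangles(districts, 0.0, 4.0, 2.0, 2, 2)   # 2-unit cells


def labels_present(grid):
    """Return the set of district labels that appear anywhere in the grid."""
    return {label for row in grid for label in row}


print(sorted(labels_present(high_res)))
print(sorted(labels_present(low_res)))
```

With 2-unit cells no cell center falls inside the enclave, so it receives no pixels at all.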


Raster-to-polygon: suppose we have two rasters with same underlying data

High resolution (small pixels)


Low resolution (large pixels)


Zonal statistics on the high-resolution raster will be more precise

High resolution (small pixels)


Low resolution (large pixels)


Low-resolution raster is more likely to generate missing values in polygon features

High resolution (small pixels)


Low resolution (large pixels)


Similar problems arise with zonal statistics on rasters with categorical variables

High resolution (small pixels)


Low resolution (large pixels)


Lower resolution \(\to\) fewer raster cells to calculate statistics over, less precision

High resolution (small pixels)


Low resolution (large pixels)
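
The loss of precision can be sketched by aggregating a fine raster into coarser cells and recomputing zonal means. Everything here is an illustrative assumption: the raster values, the 2 × 2 block-mean aggregation, and a single district boundary at x = 3 that does not line up with the coarse cells.

```python
from statistics import mean

# Hypothetical fine raster (cell size 1): values increase from west to east.
fine = [[0, 10, 20, 30] for _ in range(4)]


def aggregate(raster, factor):
    """Coarsen a raster by averaging non-overlapping factor x factor blocks."""
    coarse = []
    for r in range(0, len(raster), factor):
        coarse_row = []
        for c in range(0, len(raster[0]), factor):
            block = [raster[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(mean(block))
        coarse.append(coarse_row)
    return coarse


def zonal_means(raster, cell_size, boundary_x):
    """Mean value west and east of a boundary, assigning cells by center."""
    west, east = [], []
    for row in raster:
        for col, value in enumerate(row):
            center_x = (col + 0.5) * cell_size
            (west if center_x < boundary_x else east).append(value)
    return {"west": mean(west), "east": mean(east),
            "n_west": len(west), "n_east": len(east)}


coarse = aggregate(fine, 2)  # cell size 2

fine_stats = zonal_means(fine, 1.0, boundary_x=3.0)
coarse_stats = zonal_means(coarse, 2.0, boundary_x=3.0)
print("fine:  ", fine_stats)
print("coarse:", coarse_stats)
```

In this toy case the coarse raster averages over only two cells per district and blends values across the boundary, so both zonal means shift away from the fine-resolution values.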


Low-resolution rasters may sometimes also exaggerate amount of variation

High resolution (small pixels)


Low resolution (large pixels)


Why not always use highest-resolution raster data?

  • high-res data may not exist (due to orbital requirements, low user demand)
  • high-res satellite data are sometimes inaccessible (classified, proprietary)
  • high-res data are expensive to collect, transmit, store (terabytes, petabytes)
  • high-res data take up a lot of memory, need high-performance computing
  • high-res data may not be needed to answer research question
    (don’t need 1-meter resolution to study regional, national, global phenomena)
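
The storage cost of finer pixels can be sketched with simple arithmetic. The figures are assumptions chosen for illustration: a 100 km × 100 km study area, a single band, 4 bytes per cell, and no compression.

```python
import math


def raster_cells(width_m, height_m, cell_size_m):
    """Number of cells needed to cover an area at a given cell size."""
    return math.ceil(width_m / cell_size_m) * math.ceil(height_m / cell_size_m)


def raster_bytes(width_m, height_m, cell_size_m, bytes_per_cell=4):
    """Uncompressed size of a single-band raster (assumed 4 bytes per cell)."""
    return raster_cells(width_m, height_m, cell_size_m) * bytes_per_cell


# Hypothetical 100 km x 100 km study area at four resolutions.
for cell_size in (1000, 100, 10, 1):
    size = raster_bytes(100_000, 100_000, cell_size)
    print(f"{cell_size:>5} m cells: {size / 1e9:.4f} GB")
```

Under these assumptions, halving the cell size quadruples the number of cells, so moving from 1 km to 1 m pixels multiplies storage by a factor of one million.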

Which scale is right for me?