Today’s objectives
Data (plural): organized collections of observations or measurements
(e.g., official government statistics, crowd-sourced battlefield reports, social media posts, photo albums, public opinion polls, maps, scores)
We use data to answer questions and inform decision-making.
Examples of data applications:
Territorial control in Ukraine (today)
Territorial control data extract for New Year’s Eve, 2022
This table and the preceding map are from VIINA, a near-real-time, multi-source event data system tracking the Russian-Ukrainian War.
Available here: https://github.com/zhukovyuri/VIINA
This table is a daily extract from a panel dataset, where the same towns and villages (indexed by geonameid) are observed at multiple time points (date), enabling analysis of temporal dynamics and spatial differences.
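A minimal sketch of the panel idea, assuming a toy table rather than the real VIINA files: only the column names geonameid and date come from the slide, while the status column, its codes, and every value below are invented for illustration.

```python
from collections import defaultdict

# Toy panel. geonameid and date are the index columns named on the slide;
# "status" and all values are hypothetical placeholders, not VIINA data.
rows = [
    {"geonameid": 1, "date": "2022-12-30", "status": "UA"},
    {"geonameid": 1, "date": "2022-12-31", "status": "UA"},
    {"geonameid": 2, "date": "2022-12-30", "status": "RU"},
    {"geonameid": 2, "date": "2022-12-31", "status": "CONTESTED"},
]

def daily_extract(panel, date):
    """Cross-section: every place as observed on one date."""
    return [r for r in panel if r["date"] == date]

def status_changes(panel):
    """Temporal dynamics: places whose status differs between consecutive dates."""
    by_place = defaultdict(list)
    # ISO-formatted dates sort correctly as plain strings.
    for r in sorted(panel, key=lambda r: (r["geonameid"], r["date"])):
        by_place[r["geonameid"]].append(r)
    changes = []
    for gid, obs in by_place.items():
        for prev, cur in zip(obs, obs[1:]):
            if prev["status"] != cur["status"]:
                changes.append((gid, cur["date"], prev["status"], cur["status"]))
    return changes

nye = daily_extract(rows, "2022-12-31")
changes = status_changes(rows)
```

Filtering on date gives a daily extract like the table on the slide; grouping by geonameid and comparing consecutive dates is one simple way to look at change over time.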
War-related events in Ukraine (2/24/2022 – today)
Event data extract for New Year’s Eve, 2022
This extract is from VIINA’s event dataset, where each row records the location, timing, and attributes of a single incident, with source info.
VIINA is an example of an open-source data project.
Open-source data: information that is unclassified and non-proprietary, accessible through public channels without special permissions/clearances
Examples:
Raw data: original, unprocessed information (e.g., images, webpages, books, transcripts), requiring some cleaning or transformation before use.
Processed data: raw information after it has been cleaned, organized and stored for efficient retrieval, interpretation, and analysis.
Storage options for processed data:
csv, json, xml (simple, portable text files; easy for basic storage and transfer; can open/edit them in Excel/Google Docs)
We will be working with delimited text files only in this class.
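As a rough illustration of what a delimited text file looks like in practice, the sketch below writes a tiny table to CSV text and reads it back using Python's standard csv module. The column names and values are invented placeholders, and an in-memory buffer stands in for a file on disk.

```python
import csv
import io

# Hypothetical records; names and numbers are placeholders, not real data.
records = [
    {"region_name": "Region A", "year": 2024, "month": 1, "value": 10},
    {"region_name": "Region B", "year": 2024, "month": 1, "value": 12},
]

def to_csv_text(rows):
    """Serialize a list of dicts to comma-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

def from_csv_text(text):
    """Parse comma-delimited text back into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

text = to_csv_text(records)
back = from_csv_text(text)
```

One thing worth noticing: plain text files carry no type information, so every value comes back as a string (the year is read as "2024", not the number 2024) and has to be converted before analysis.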
The 1937 All-Soviet Census
Discussion
How might censored or falsified data affect government decision-making and public policy?
Deceitful numbers
Data use in Stalin’s USSR
How the Kremlin keeps its secrets
Don’t chatter
Data type 1: Government statistics
Examples:
eng.rosstat.gov.ru
cikrf.ru
demoscope.ru
Not a bell curve
Data type 2: Administrative records
Examples:
pamyat-naroda.ru
ovd.info/en
Another data point
Data type 3: Public opinion surveys
Reliability of survey data in Russia
Examples:
levada.ru
Putin approval (Levada)
Data type 4: Text data
Examples:
militera.lib.ru
soldat.ru
kremlin.ru
Poems can be data
Data type 5: Geospatial data
Examples:
Moscow (data.mos.ru), St. Petersburg (gov.spb.ru)
davidrumsey.com, tinyurl.com/28a7shm5
freegisdata.rtwilson.com
For official use only
Data type 6: Non-geographic images
Examples:
Who’s who
Data type 7: The dark side
In academic research and teaching, we cannot use such data due to privacy, consent, and institutional restrictions
Examples:
bellingcat.com
Geolocated Buk-332
Data best practices
“snake_case” for maximum compatibility (e.g., putin_approval, region_name, year, month)
.csv
.pdf
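One way to apply the snake_case convention is to normalize names programmatically. The helper below is a hypothetical sketch (to_snake_case is not from any library), assuming the names in question are column or variable names typed with spaces, capitals, or punctuation.

```python
import re

def to_snake_case(name: str) -> str:
    """Convert a messy column name to lowercase words joined by underscores."""
    s = name.strip()
    # Put an underscore at camelCase boundaries: regionName -> region_Name
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", s)
    # Collapse any run of characters that are not letters or digits into one underscore
    s = re.sub(r"[^0-9A-Za-z]+", "_", s)
    return s.strip("_").lower()

cleaned = [to_snake_case(n) for n in ["Putin Approval", "regionName", " Year ", "Month (1-12)"]]
```

Names without spaces or special characters can be used unchanged in most statistical software, which is the compatibility benefit the slide refers to.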
project_folder/
|-- data/
    |-- raw/
    |   |-- levada_survey_2025.csv
    |   |-- rosstat_demographics.xlsx
    |-- processed/
    |   |-- combined_data.csv
|-- documentation/
    |-- codebook.pdf
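The folder layout above could be set up by hand, or with a few lines of Python. This is only a sketch: it assumes raw/ and processed/ sit inside data/, and it builds the tree in a temporary directory so that nothing is written to your real files. The function name make_project is made up for this example.

```python
from pathlib import Path
import tempfile

# Subfolders from the layout above (assumed nesting: raw/ and processed/ under data/).
SUBFOLDERS = ["data/raw", "data/processed", "documentation"]

def make_project(root):
    """Create the empty project skeleton under root and return its path."""
    root = Path(root)
    for sub in SUBFOLDERS:
        # parents=True also creates data/; exist_ok=True makes reruns harmless
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root

with tempfile.TemporaryDirectory() as tmp:
    project = make_project(Path(tmp) / "project_folder")
    created = sorted(
        p.relative_to(project).as_posix() for p in project.rglob("*") if p.is_dir()
    )
```

Keeping raw files in their own folder and writing cleaned versions to processed/ means the original downloads are never overwritten.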
NEXT MEETING
Economic Foundations: Land, Labor and Serfdom (Th, Sep. 11)