Our immediate goal is to collect data on administrative-territorial changes in Soviet Ukraine. The downstream analytical goal is to better understand why countries redraw their internal administrative borders, and what sorts of political, economic and social legacies these changes leave behind.
Boundary changes come in several types: units can be created, merged, split, or abolished. They can apply to legislative, jurisdictional, and administrative borders, and they happen for a variety of reasons, from technocratic “optimization” and demographic change to political survival.
Like many countries, the Soviet Union frequently changed its internal administrative boundaries throughout its existence, for political, economic, and ethnic reasons. These boundary changes varied in the extent to which pre-existing communities were kept intact between the old and new maps. For example, the USSR sometimes consolidated pre-existing political communities into larger units (Checheno-Ingush ASSR), but at other times carved them up between neighboring provinces, wiping away all internal borders and leaving no trace of their existence (Volga German ASSR).
We will assemble data on these changes using declassified Soviet gazetteers. A gazetteer is a geographical dictionary or directory that provides detailed information about places, including names, locations, administrative divisions, and sometimes historical or cultural details. Ideally, we will be able to cover the full period of Soviet Ukrainian history from the 1920s to 1991. Our first priority will be to collect data on the pre-WWII period, 1921-1939.
Below is a set of instructions on how to create tables of historical administrative units from scanned PDFs of declassified archival gazetteers, using generative AI.
Sign up for Perplexity with your Georgetown email address (netid@georgetown.edu) by selecting “Continue with Email.” Addresses ending in .edu are automatically upgraded to the “Perplexity Pro” tier for one month or longer when they sign up, and are eligible for a discounted rate of $4.99/month when the trial period expires.
The scanned PDFs are in the YZRA/Data/ATD/Raw/AI_Ready directory in Dropbox.
Files are named YEAR_FileDescription_01.pdf, so that sorting them alphabetically also sorts them chronologically.
Each source is split into several parts (_01, _02, etc.), due to memory and response length limits on Perplexity.
1921_SpVolUSSR_*
Please convert the attached PDF list of governorates, districts and settlements into
a csv table. Column names should be in English snake_case. Table contents should
preserve original Cyrillic characters.
1925_adminterpodil_*
Please extract the contents of the table(s) of regions and districts from the attached
PDF into csv table(s). Column names should be in English snake_case. Table contents
should preserve original Cyrillic characters.
1926_TerAdmPodSSSR_*
Please convert the attached PDF list of districts and district centers (grouped by
region) into a csv table. Column names should be in English snake_case. Table contents
should preserve original Cyrillic characters.
1930_AdmTerDelSSSR_*
Please extract the contents of the table(s) of regions and districts from the
attached PDF into csv table(s). Column names should be in English snake_case. Table
contents should preserve original Cyrillic characters.
1931_AdmTerDelSSSR_*
Please extract the contents of the table(s) of regions and districts from the
attached PDF into csv table(s). Column names should be in English snake_case. Table
contents should preserve original Cyrillic characters.
1933_AdmTerPod_*
Please convert the attached PDF list of district characteristics and regions into a
csv table. Column names should be in English snake_case. Table contents should
preserve original Cyrillic characters.
1935_AdmTerSSSR_*
Please extract the contents of the table(s) of regions and districts from the
attached PDF into csv table(s). Column names should be in English snake_case. Table
contents should preserve original Cyrillic characters.
1936_RayUSRR_*
Please convert the attached PDF table of contents into a csv table of regions and the
districts/okrugs they contain. Column names should be in English snake_case. Table
contents should preserve original Cyrillic characters.
1937_AdmTerDelSSSR_*
Please extract the contents of the table(s) of districts and towns (grouped by
region) from the attached PDF into csv table(s). Column names should be in English
snake_case. Table contents should preserve original Cyrillic characters.
1940_AdmTerDelSSSR_*
Please extract the contents of the table(s) of districts and towns (grouped by
region) from the attached PDF into csv table(s). Column names should be in English
snake_case. Table contents should preserve original Cyrillic characters.
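The prompt list above is keyed by file-name prefix, so the lookup can be sketched in a few lines of Python. This is only an illustration, not part of the workflow itself: the names `PROMPTS`, `parse_filename`, and `prompt_for` are hypothetical, and the sketch assumes every file follows the YEAR_FileDescription_01.pdf pattern with no underscore inside the description.

```python
import re

# Ending shared by every prompt listed above.
SUFFIX = (
    " Column names should be in English snake_case."
    " Table contents should preserve original Cyrillic characters."
)

# Two prompt openings are reused for several sources.
REGIONS_DISTRICTS = (
    "Please extract the contents of the table(s) of regions and districts"
    " from the attached PDF into csv table(s)."
)
DISTRICTS_TOWNS = (
    "Please extract the contents of the table(s) of districts and towns"
    " (grouped by region) from the attached PDF into csv table(s)."
)

# File-name prefix -> opening of the prompt (copied from the list above).
PROMPTS = {
    "1921_SpVolUSSR": (
        "Please convert the attached PDF list of governorates, districts"
        " and settlements into a csv table."
    ),
    "1925_adminterpodil": REGIONS_DISTRICTS,
    "1926_TerAdmPodSSSR": (
        "Please convert the attached PDF list of districts and district"
        " centers (grouped by region) into a csv table."
    ),
    "1930_AdmTerDelSSSR": REGIONS_DISTRICTS,
    "1931_AdmTerDelSSSR": REGIONS_DISTRICTS,
    "1933_AdmTerPod": (
        "Please convert the attached PDF list of district characteristics"
        " and regions into a csv table."
    ),
    "1935_AdmTerSSSR": REGIONS_DISTRICTS,
    "1936_RayUSRR": (
        "Please convert the attached PDF table of contents into a csv table"
        " of regions and the districts/okrugs they contain."
    ),
    "1937_AdmTerDelSSSR": DISTRICTS_TOWNS,
    "1940_AdmTerDelSSSR": DISTRICTS_TOWNS,
}

# Assumed pattern: four-digit year, description without underscores, part number.
FILENAME_RE = re.compile(r"^(?P<year>\d{4})_(?P<desc>[^_]+)_(?P<part>\d+)\.pdf$")


def parse_filename(name):
    """Split 'YEAR_FileDescription_NN.pdf' into (year, description, part)."""
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return int(match["year"]), match["desc"], int(match["part"])


def prompt_for(name):
    """Return the full prompt for one PDF, looked up by its file-name prefix."""
    year, desc, _part = parse_filename(name)
    return PROMPTS[f"{year}_{desc}"] + SUFFIX


files = ["1936_RayUSRR_02.pdf", "1921_SpVolUSSR_01.pdf", "1936_RayUSRR_01.pdf"]
for name in sorted(files):  # alphabetical order is also chronological
    print(name, "->", prompt_for(name))
```

A file whose prefix is not in the table raises a `KeyError`, which is probably the safer behavior here: it flags a source that still needs its own prompt instead of silently reusing another one.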
Example output (1936_RayUSRR_01.pdf):
region,district_okrug
КИЇВСЬКА ОБЛАСТЬ,Київська м/р
КИЇВСЬКА ОБЛАСТЬ,Житомірська м/р
КИЇВСЬКА ОБЛАСТЬ,Андрушівський
КИЇВСЬКА ОБЛАСТЬ,Бабанський
КИЇВСЬКА ОБЛАСТЬ,Базарський
КИЇВСЬКА ОБЛАСТЬ,Баришівський
КИЇВСЬКА ОБЛАСТЬ,Березанський
КИЇВСЬКА ОБЛАСТЬ,Білоцерківський
...
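If the output is incomplete or incorrect, select a different model (Sonar, Claude, Gemini) and wait for Perplexity to re-run the query.
Save the output to YZRA/APT/Data/Processed under the same base name as the input, so that the output for 1936_RayUSRR_01.pdf is saved as 1936_RayUSRR_01.md.
Repeat for 1936_RayUSRR_02.pdf, saving its output as 1936_RayUSRR_02.md, and for any remaining parts (1936_RayUSRR_03.pdf, 1936_RayUSRR_04.pdf, etc.).
When you are done, you should have 1936_RayUSRR_01.md and 1936_RayUSRR_02.md in the Processed folder.
Update each file's status from To Do to Working and Done as you go.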
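Because the tables are produced by a generative model, a quick automated check before marking a file as done may be worthwhile. The sketch below is one possible approach, not an established part of this workflow: `check_csv_text` and `missing_outputs` are hypothetical helper names, and the checks (snake_case header, consistent field counts, Cyrillic present in every row) are assumptions drawn from the prompts and the example output above.

```python
import csv
import io
import re
from pathlib import PurePosixPath

SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
CYRILLIC = re.compile(r"[\u0400-\u04FF]")  # basic Cyrillic Unicode block


def check_csv_text(text):
    """Return a list of problems found in one CSV table (empty list if none)."""
    # Blank lines are skipped; the header counts as row 1.
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return ["table is empty"]
    header, body = rows[0], rows[1:]
    problems = []
    for name in header:
        if not SNAKE_CASE.match(name):
            problems.append(f"column name not snake_case: {name!r}")
    for number, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {number}: expected {len(header)} fields, got {len(row)}"
            )
        elif not any(CYRILLIC.search(cell) for cell in row):
            problems.append(f"row {number}: no Cyrillic text")
    return problems


def missing_outputs(pdf_names, md_names):
    """List the PDFs that have no .md output with the same base name."""
    done = {PurePosixPath(name).stem for name in md_names}
    return sorted(n for n in pdf_names if PurePosixPath(n).stem not in done)


sample = (
    "region,district_okrug\n"
    "КИЇВСЬКА ОБЛАСТЬ,Київська м/р\n"
    "КИЇВСЬКА ОБЛАСТЬ,Андрушівський\n"
)
print(check_csv_text(sample))
print(missing_outputs(
    ["1936_RayUSRR_01.pdf", "1936_RayUSRR_02.pdf"],
    ["1936_RayUSRR_01.md"],
))
```

These checks only catch structural problems. They cannot tell whether a district name was transcribed correctly, so spot-checking rows against the scanned page is still needed.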