Overview
What steps are involved in a research project?
Different steps will be more/less important for different types of projects
What will we do this week?
Illustrative example: a research project on Islamic State violence in Iraq and Syria
Review methods:
Introduce new methods:
QGIS step-by-step & R replication code on Canvas (no problem set!)
Selecting (and refining) a research question
“What” versus “why”
describe what happened:
explain why something happened:
Descriptive \(\neq\) atheoretical
Illustrative example:
Rise and fall of the Islamic State
Research question:
Why was ISIS more active in some parts of Iraq and Syria than in others?
Capture the flag
hypothesis
noun, plural hypotheses [www.dictionary.com/browse/hypothesis]
A proposition, or set of propositions, set forth as an explanation for the occurrence of some specified group of phenomena
What makes a “good” hypothesis?
Hypothesis 1: state weakness
A road to perdition
Hypothesis 2: demographics
A built-up area
Hypothesis 3: political economy
A steady paycheck
Hypothesis 4: key infrastructure
A critical object
Hypothesis 5: sectarian divisions
A potential base of support
Data collection: categories of geospatial data
“Off-the-shelf” geospatial data (ready to use)
(e.g. 500+ sites with free GIS data: freegisdata.rtwilson.com)
Raw geospatial data
Illustrative example
What is our geographic unit of analysis?
Administrative level 0 | Administrative level 1 | Administrative level 2 | Other |
---|---|---|---|
e.g. country | e.g. province/state | e.g. district/county | e.g. grid cell |
| | | \(\checkmark\) | |
Illustrative example
What outcome are we trying to explain?
Hypotheses | Dependent variable | Data needed | Format |
---|---|---|---|
1-5 | Number of ISIS attacks per district | event locations | vector |
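
One way to build this dependent variable is a point-in-polygon count: assign each event location to the district whose boundary contains it, then tally. The sketch below is a minimal pure-Python illustration, not the QGIS or R workflow from the course. The district squares, event coordinates, and function names are all hypothetical.

```python
# Hypothetical toy data: two square "districts" and four event points.
# Real boundaries and event locations would come from GIS layers.

def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon,
    given as a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_events_per_district(districts, events):
    """Tally events by the first district that contains them."""
    counts = {name: 0 for name in districts}
    for x, y in events:
        for name, polygon in districts.items():
            if point_in_polygon(x, y, polygon):
                counts[name] += 1
                break  # each event belongs to at most one district
    return counts


districts = {
    "district_A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "district_B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
events = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0), (5.0, 5.0)]

attack_counts = count_events_per_district(districts, events)
print(attack_counts)
```

The last event falls outside both districts and is not counted. In practice such unmatched points are worth inspecting, since they often signal geocoding errors or a mismatch in coordinate systems.
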
Illustrative example
What data on explanatory variables do we need to test our hypotheses?
Hypothesis | Explanatory variable | Data needed | Format |
---|---|---|---|
1. Power projection | road density | roads | vector |
2. Demographics | local population size | population | raster |
3. Political economy | % of land used for agriculture | land cover/use | raster |
4. Key infrastructure | proximity to dams | dam locations | vector |
5. Sectarian divisions | local presence of Sunni Arabs | ethnic settlement | vector |
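
As an illustration of turning raw locations into an explanatory variable, the sketch below computes "proximity to dams" as the great-circle distance from each district centroid to the nearest dam. It is only one possible operationalization. The centroid and dam coordinates are made up, and the function names are hypothetical.

```python
import math


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two lon/lat points (degrees)."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def distance_to_nearest(point, sites):
    """Distance in km from a (lon, lat) point to the closest site."""
    lon, lat = point
    return min(haversine_km(lon, lat, s_lon, s_lat) for s_lon, s_lat in sites)


# Hypothetical (lon, lat) coordinates, for illustration only.
district_centroids = {
    "district_A": (43.0, 36.0),
    "district_B": (44.0, 33.0),
}
dam_sites = [(43.0, 36.5), (42.0, 34.0)]

dam_distance_km = {
    name: distance_to_nearest(centroid, dam_sites)
    for name, centroid in district_centroids.items()
}
print(dam_distance_km)
```

Centroid-to-point distance is a simplification. Distance from the district boundary, or a count of dams inside the district, would be equally defensible choices.
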
Pre-processing
Having the data in hand does not mean you’re ready for analysis
Common pre-processing tasks:
Merging datasets (e.g. join 2 tables by a common field/variable)
Queries and subsets
Overlay operations
Simplification and generalization
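
The first two tasks can be sketched in a few lines. The example below joins two tables on a common field and then runs a simple query on the result. The tables, the `district_id` field, and all values are hypothetical.

```python
# Hypothetical tables keyed by a shared district identifier.
attacks = [
    {"district_id": "D1", "attacks": 12},
    {"district_id": "D2", "attacks": 0},
    {"district_id": "D3", "attacks": 5},
]
population = [
    {"district_id": "D1", "population": 150000},
    {"district_id": "D3", "population": 80000},
]


def left_join(left, right, key, fields):
    """Keep every row of `left`; copy `fields` from the matching row of
    `right`, or None when there is no match."""
    lookup = {row[key]: row for row in right}
    merged = []
    for row in left:
        match = lookup.get(row[key], {})
        combined = dict(row)
        for field in fields:
            combined[field] = match.get(field)
        merged.append(combined)
    return merged


merged = left_join(attacks, population, "district_id", ["population"])

# Query: districts that did not find a match in the population table
unmatched = [row["district_id"] for row in merged if row["population"] is None]

# Subset: districts with at least one attack
active = [row for row in merged if row["attacks"] > 0]

print(merged)
print(unmatched)
```

Checking for unmatched rows after every join is a cheap safeguard: mismatched spellings or ID formats are a common source of silently missing data.
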
What kind of evidence is needed to confirm/reject a hypothesis?
Compare empirical observations to theoretical expectations
Expectation | Observation | Consistent with hypothesis? |
---|---|---|
\(X\) is positively associated with \(Y\) | positive correlation between \(X\) and \(Y\) | yes |
\(X\) is positively associated with \(Y\) | negative correlation between \(X\) and \(Y\) | no |
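
The comparison in this table can be sketched as a sign check on a Pearson correlation coefficient. The data below are made up and the function names are hypothetical. The sign alone says nothing about the strength or statistical significance of the relationship.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def consistent_with_expectation(r, expected_sign):
    """True if the sign of r matches the expected sign ('+' or '-')."""
    return r > 0 if expected_sign == "+" else r < 0


# Hypothetical observations of X and Y across five districts
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]

r = pearson_r(x, y)
print(round(r, 3), consistent_with_expectation(r, "+"))
```
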
Methods for hypothesis testing:
Visual inspection of maps
Descriptive statistics (e.g. difference in means)
Statistical graphics (e.g. box plots, bar plots, histograms)
Statistical modeling (e.g. multivariate regression)
There is no silver bullet! Best practice is to use multiple methods.
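
As a small illustration of the descriptive-statistics route, the sketch below computes a difference in means: average attacks in districts with a dam minus average attacks in districts without one. All records are hypothetical.

```python
from statistics import mean

# Hypothetical district records, for illustration only.
district_records = [
    {"name": "D1", "has_dam": True, "attacks": 9},
    {"name": "D2", "has_dam": True, "attacks": 7},
    {"name": "D3", "has_dam": True, "attacks": 8},
    {"name": "D4", "has_dam": False, "attacks": 2},
    {"name": "D5", "has_dam": False, "attacks": 4},
    {"name": "D6", "has_dam": False, "attacks": 3},
    {"name": "D7", "has_dam": False, "attacks": 3},
]

with_dam = [d["attacks"] for d in district_records if d["has_dam"]]
without_dam = [d["attacks"] for d in district_records if not d["has_dam"]]

# A positive difference means more attacks, on average, where a dam is present.
difference = mean(with_dam) - mean(without_dam)
print(mean(with_dam), mean(without_dam), difference)
```

A raw difference like this does not account for other factors (population, roads, and so on), which is one reason to pair it with the other methods on the list.
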
Illustrative example
Use regression analysis to test all 5 hypotheses at once:
\[\begin{align*} \text{violence}_i=&\beta_1 \text{road density}_i + \beta_2 \text{population}_i +\beta_3 \text{cropland}_i \\ &+\beta_4 \text{dams}_i + \beta_5 \text{Sunni presence}_i + \epsilon_i \end{align*}\]
where \(i\) indexes districts and each coefficient has the expected sign shown below.
Hypothesis | Expectation | Observation |
---|---|---|
1. Power projection | \(\beta_1<0\) | ? |
2. Demographics | \(\beta_2>0\) | ? |
3. Political economy | \(\beta_3<0\) | ? |
4. Key infrastructure | \(\beta_4>0\) | ? |
5. Sectarian divisions | \(\beta_5>0\) | ? |
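
The sketch below shows how such a model could be estimated by ordinary least squares and how the estimated signs could be checked against the expectations in the table. It is a pure-Python illustration under strong assumptions, not the course's R replication code. The design matrix is made up, and the outcome is generated without noise from hypothetical "true" coefficients so that the estimates can be verified. Like the equation above, it omits an intercept.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix; regressors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols_no_intercept(X, y):
    """OLS estimates from the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Columns: road density, population, cropland, dams, Sunni presence.
# Hypothetical, unitless values for seven districts.
X = [
    [1.0, 0.2, 0.5, 0.0, 1.0],
    [0.0, 1.5, 0.3, 1.0, 0.0],
    [0.0, 0.0, 2.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
    [0.8, 1.1, 0.4, 1.0, 1.0],
    [1.2, 0.7, 1.6, 0.0, 0.0],
]
true_beta = [-2.0, 1.0, -1.5, 3.0, 2.5]  # hypothetical coefficients
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]  # no noise

beta_hat = ols_no_intercept(X, y)

expected_signs = ["-", "+", "-", "+", "+"]  # from the expectations table
observed_signs = ["+" if b > 0 else "-" for b in beta_hat]
print([round(b, 3) for b in beta_hat])
print(observed_signs == expected_signs)
```

With real data the outcome would include noise, so each estimate would need a standard error before its sign is read as support for or against a hypothesis.
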
Discussion
What are the results of the analysis?
What are the broader implications of these results?
Let’s go! (switch to lab)