Fieldwork and Research
Research Methods in Geography
Primary and Secondary Data
Primary data is collected firsthand by the researcher for the specific purpose of the investigation. Methods include field observations, measurements, questionnaires, interviews, and experiments. Primary data is directly relevant to the research question but can be time-consuming and expensive to collect.
Secondary data is data that has been collected by someone else for another purpose. Sources include government statistics (Census, ONS), academic publications, maps, satellite imagery, historical records, and media reports. Secondary data is readily available and often covers large areas or long time periods, but may not be precisely relevant to the research question and may be of variable quality.
Quantitative and Qualitative Methods
Quantitative methods produce numerical data that can be analysed statistically. Examples: measurements of river velocity, questionnaires with closed questions, sediment size analysis, pedestrian counts. Quantitative data is objective, replicable, and allows comparison and generalisation, but may oversimplify complex phenomena.
Qualitative methods produce non-numerical data (words, images, observations). Examples: interviews, open-ended questionnaires, field sketches, participant observation, photographs. Qualitative data provides depth, context, and rich description but is subjective, difficult to analyse systematically, and may not be generalisable.
Research Strategies
Case studies: in-depth investigation of a single location, event, or group. Provide rich, detailed data and are useful for exploring complex phenomena. Limited generalisability.
Comparative studies: investigation of two or more cases to identify similarities, differences, and patterns. Strengthen the ability to draw general conclusions but require careful selection of cases and control of confounding variables.
Longitudinal studies: data collection over an extended period (months to years). Allow the study of change over time but are time-consuming and vulnerable to attrition.
Cross-sectional studies: data collection at a single point in time. Efficient and practical but cannot establish temporal sequences or causal relationships.
Data Collection Techniques
Physical Geography Techniques
River studies:
- Width, depth, and velocity: measured at multiple points across the channel using a tape measure, ranging pole, and flow meter (impeller or floats). Systematic sampling across the channel provides data on channel geometry and discharge ($Q = A \times v$, cross-sectional area multiplied by mean velocity).
- Sediment analysis: sediment samples collected from the riverbed, sieved to determine particle size distribution, and analysed using the Wentworth scale or phi ($\phi$) scale. Smaller, rounder particles indicate greater transport distance.
- Hydraulic radius: $R = A / P$ (cross-sectional area divided by wetted perimeter). Higher hydraulic radius indicates greater efficiency.
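The river calculations can be sketched in a few lines of Python. This is a minimal illustration, not a standard procedure: the function name `channel_metrics`, the trapezium-rule area, the straight-segment wetted perimeter, and the sample readings are all assumptions made for the example.

```python
import math

def channel_metrics(width_m, depths_m, velocities_m_s):
    """Estimate channel area, discharge and hydraulic radius.

    Assumes depths_m are evenly spaced from bank to bank (including the
    zero-depth readings at each bank) and that the mean of the velocity
    readings represents the whole cross-section.
    """
    n_gaps = len(depths_m) - 1
    spacing = width_m / n_gaps
    # Cross-sectional area by the trapezium rule
    area = sum((depths_m[i] + depths_m[i + 1]) / 2 * spacing for i in range(n_gaps))
    # Wetted perimeter as the sum of straight bed segments between readings
    wetted_perimeter = sum(
        math.hypot(spacing, depths_m[i + 1] - depths_m[i]) for i in range(n_gaps)
    )
    mean_velocity = sum(velocities_m_s) / len(velocities_m_s)
    return {
        "area_m2": area,
        "discharge_m3_s": area * mean_velocity,          # Q = A x v
        "hydraulic_radius_m": area / wetted_perimeter,   # R = A / P
    }

# Hypothetical readings for a 4 m wide channel
result = channel_metrics(4.0, [0.0, 0.5, 0.5, 0.5, 0.0], [0.2, 0.4, 0.3])
print(result)
```

Taking more depth readings across the channel makes both the area and the wetted perimeter estimates closer to the real channel shape.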
Coastal studies:
- Beach profiling: using a clinometer and ranging poles to measure slope angle at regular intervals from the backshore to the low water mark. Profiles can be compared across seasons or after storm events to assess erosion and deposition.
- Sediment analysis: pebble size, shape (roundness, sphericity using Cailleux or Powers index), and composition assessed at regular intervals along the beach.
- Longshore drift: measuring the direction and rate of sediment transport using tracer pebbles (painted or tagged pebbles released at a known point and recovered after a set period).
Ecosystem studies:
- Quadrat sampling: placing a quadrat (a square frame of known area) at systematic or random intervals within the study area to record species presence, abundance, or percentage cover.
- Transect sampling: recording data along a line (belt transect or line transect) across an environmental gradient (e.g., from low water mark to cliff top). Useful for studying zonation.
- Biotic indices: using indicator species to assess environmental quality (e.g., freshwater macroinvertebrates, scored with an index such as the BMWP, as indicators of water pollution; lichen species as indicators of air quality).
Human Geography Techniques
Questionnaires: structured instruments using closed questions (Likert scales, multiple choice) for quantitative data, and open questions for qualitative data. Can be administered in person, by post, or online. Advantages: efficient, standardised, anonymous. Limitations: low response rates, social desirability bias, limited depth.
Interviews: semi-structured or unstructured interviews allow in-depth exploration of attitudes, perceptions, and experiences. Can reveal unexpected insights but are time-consuming, subjective, and difficult to analyse systematically.
Observations: systematic (structured, quantitative) or participant (qualitative, immersive) observation. Observation avoids the bias of self-report data but is limited to observable behaviour and may be influenced by the observer's presence (Hawthorne effect).
Pedestrian and traffic counts: counting people or vehicles at specific locations at set times to assess patterns of movement, land use, and accessibility. Provides objective quantitative data but captures only a snapshot.
Secondary data sources: Census data (population, employment, housing), deprivation indices (Index of Multiple Deprivation), crime statistics, health data, economic indicators.
Sampling Strategies
Random sampling: every member of the target population has an equal chance of selection. Eliminates sampling bias but requires a complete list of the population and may produce an unrepresentative sample by chance.
Systematic sampling: selecting every $n$th item from a list at regular intervals. Simple and ensures even coverage but may coincide with a periodic pattern in the data.
Stratified sampling: the population is divided into strata based on relevant characteristics, and samples are drawn proportionally from each stratum. Ensures representativeness but requires knowledge of the population's composition.
Opportunity sampling: selecting whatever is readily available. Quick and convenient but produces a biased sample.
Cluster sampling: the population is divided into clusters (e.g., geographical areas), and a random sample of clusters is selected. All members of selected clusters are studied (or a random sample within each cluster). Practical for large or dispersed populations.
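Three of these strategies can be sketched with Python's standard `random` module. The function names, the `seed` parameter, and the example population are hypothetical and only illustrate the selection logic.

```python
import random

def random_sample(population, k, seed=None):
    """Random sampling: every member has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, step, start=0):
    """Systematic sampling: every `step`-th item, beginning at index `start`."""
    return population[start::step]

def stratified_sample(strata, total, seed=None):
    """Stratified sampling with proportional allocation.

    `strata` maps a stratum name to a list of its members. Rounding means
    the stratum sizes may not always add up exactly to `total`.
    """
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        k = round(total * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 100 households, 60 urban and 40 rural
households = list(range(100))
strata = {"urban": households[:60], "rural": households[60:]}

print(random_sample(households, 10, seed=1))
print(systematic_sample(households, 10))
print({name: len(chosen) for name, chosen in stratified_sample(strata, 10, seed=1).items()})
```

With a 60:40 split and a sample of 10, proportional allocation draws 6 urban and 4 rural households.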
Data Analysis and Presentation
Graphical Techniques
- Bar charts: compare discrete categories
- Histograms: display the distribution of continuous data (bars touch, representing frequency density)
- Line graphs: show trends over time or along a transect
- Scatter graphs: display the relationship between two continuous variables; a line of best fit can be drawn to assess correlation
- Pie charts: show proportional composition (limited to a small number of categories)
- Choropleth maps: display spatial variation in a variable using shading or colour intensity
- Proportional symbol maps: symbols (circles, squares) sized in proportion to the data value at each location
- Triangular graphs: display three variables simultaneously (e.g., soil composition: sand, silt, clay)
- Radial diagrams: display data as sectors radiating from a central point, useful for comparing multiple variables at one location
Descriptive Statistics
Measures of central tendency:
- Mean: $\bar{x} = \frac{\sum x}{n}$. Uses all data points but affected by outliers.
- Median: the middle value when data are sorted. Robust to outliers.
- Mode: the most frequent value. Useful for categorical data.
Measures of dispersion:
- Range: maximum value minus minimum value. Simple but affected by outliers.
- Interquartile range (IQR): $Q_3 - Q_1$. Robust to outliers; represents the middle 50% of the data.
- Standard deviation: $\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{n}}$. Measures the typical deviation from the mean; uses all data points.
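A short sketch using Python's standard `statistics` module shows how these measures behave on a small data set. The pebble lengths are hypothetical, and the quartile method used by `statistics.quantiles` is one of several conventions, so an IQR worked by hand with a different method may differ slightly.

```python
import statistics

def describe(values):
    """Summarise a list of numbers with common descriptive statistics."""
    # Quartiles use the module's default method; other conventions exist
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "std_dev": statistics.pstdev(values),  # population form, divides by n
    }

# Hypothetical pebble long-axis lengths in mm, including one outlier
pebbles_mm = [2, 4, 4, 5, 7, 9, 40]
summary = describe(pebbles_mm)
print(summary)
```

The single outlier pulls the mean (about 10.1) well above the median (5), which is why the median and IQR are preferred for skewed data.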
Inferential Statistics
Spearman's Rank Correlation Coefficient ($r_s$)
Measures the strength and direction of the monotonic relationship between two ranked variables.
$$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$
Where $d$ is the difference in ranks for each pair of observations, and $n$ is the sample size. $r_s$ ranges from $-1$ (perfect negative correlation) to $+1$ (perfect positive correlation). The significance of $r_s$ is tested against critical values at a chosen significance level (e.g., $p = 0.05$).
Mann-Whitney U Test
A non-parametric test for comparing two independent samples. Tests whether one sample tends to have larger values than the other.
$$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$$
Where $n_1$ and $n_2$ are the sample sizes and $R_1$ is the sum of ranks in sample $1$. The calculated $U$ value is compared to critical values to determine significance.
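The calculation can be sketched as follows, assuming the common form $U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1$ and the usual convention of comparing the smaller of $U_1$ and $U_2$ with the critical value. The helper names and the sample data are hypothetical.

```python
def rank_with_ties(values):
    """Return ranks (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def mann_whitney_u(sample_1, sample_2):
    """Return (U, U1, U2), where U is the smaller of U1 and U2."""
    n1, n2 = len(sample_1), len(sample_2)
    # Rank both samples together, then split the ranks back out
    ranks = rank_with_ties(list(sample_1) + list(sample_2))
    r1 = sum(ranks[:n1])
    r2 = sum(ranks[n1:])
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return min(u1, u2), u1, u2

# Hypothetical pebble sizes (mm) at an upstream and a downstream site
upstream = [52, 61, 48, 70, 66]
downstream = [31, 45, 28, 40, 36]
print(mann_whitney_u(upstream, downstream))
```

A useful check on the arithmetic is that $U_1 + U_2$ always equals $n_1 n_2$.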
Chi-Squared ($\chi^2$) Test
Tests whether there is a significant association between two categorical variables.
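$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
Where $O$ is the observed frequency and $E$ is the expected frequency for each cell. Expected frequency $E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$. The calculated $\chi^2$ value is compared to critical values with the appropriate degrees of freedom ($df = (\text{rows} - 1)(\text{columns} - 1)$).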
Student's t-Test
A parametric test for comparing the means of two groups (independent or paired). Assumes normally distributed data and equal variances. The independent t-test compares two unrelated groups; the paired t-test compares two related samples (e.g., before and after measurements).
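As a rough sketch, the independent-samples statistic under the equal-variance assumption can be computed from a pooled variance. The function name and the infiltration data below are hypothetical, and the result still has to be compared with a critical value from a $t$ table.

```python
import math
import statistics

def independent_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for two unrelated samples.

    Assumes roughly normal data and equal variances, so the two sample
    variances are pooled into a single estimate.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (divides by n - 1)
    var_b = statistics.variance(sample_b)
    pooled_variance = ((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, n_a + n_b - 2

# Hypothetical infiltration rates (mm per hour) on two land uses
woodland = [42, 38, 45, 40, 44]
pasture = [30, 33, 28, 35, 31]
t_value, df = independent_t(woodland, pasture)
print(round(t_value, 2), df)
```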
Geographical Information Systems (GIS)
What Is GIS?
A GIS is a computer-based system for storing, analysing, manipulating, and displaying spatially referenced data. GIS integrates geographical data (location) with attribute data (characteristics).
Key GIS Functions
- Data input and storage: importing data from various sources (GPS, remote sensing, digitised maps, databases)
- Data management: organising, editing, and maintaining spatial databases
- Spatial analysis: buffer zones (areas within a specified distance of a feature), overlay analysis (combining multiple data layers), network analysis (shortest path, service area), spatial interpolation (estimating values at unsampled locations)
- Visualisation: creating maps, 3D models, and animations to communicate spatial patterns and relationships
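A buffer query, the simplest of these spatial analyses, reduces to a distance test. The sketch below assumes projected coordinates in metres (for example, eastings and northings) and uses straight-line distance; the function name and the site coordinates are hypothetical, and a real GIS package would handle this with its own geometry tools.

```python
import math

def within_buffer(feature_xy, points, radius_m):
    """Return the names of points lying within radius_m of the feature.

    Coordinates are assumed to be projected (x, y) pairs in metres, so
    straight-line distance can be used directly.
    """
    fx, fy = feature_xy
    return [
        name
        for name, (x, y) in points.items()
        if math.hypot(x - fx, y - fy) <= radius_m
    ]

# Hypothetical survey sites relative to a river gauge at the origin
sites = {"site_a": (300, 400), "site_b": (600, 800), "site_c": (-100, 50)}
print(within_buffer((0, 0), sites, 500))
```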
Applications in Geography
- Physical geography: mapping land use change, modelling flood risk, analysing coastal erosion patterns, monitoring deforestation using satellite imagery
- Human geography: mapping population density, deprivation, transport accessibility, retail catchment areas, migration flows
- Fieldwork: displaying data collection points on a base map, analysing spatial patterns in primary data, creating layered maps that integrate multiple data sources
Remote Sensing
Remote sensing is the acquisition of information about the Earth's surface from a distance, typically using satellite or aerial sensors. Key applications:
- Land cover and land use mapping: classifying satellite imagery to identify vegetation, urban areas, water bodies, and bare soil
- Change detection: comparing imagery from different dates to identify deforestation, urban expansion, coastal erosion, or glacial retreat
- Vegetation monitoring: using the Normalised Difference Vegetation Index (NDVI) to assess plant health and biomass
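NDVI is calculated from the near-infrared (NIR) and red reflectance of each pixel as $(NIR - Red) / (NIR + Red)$, giving values between $-1$ and $+1$. A minimal per-pixel sketch follows; the function name and reflectance values are hypothetical.

```python
def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel.

    Returns None when both reflectances are zero, because the index is
    undefined in that case.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total

# Hypothetical reflectance values (0 to 1)
print(ndvi(0.50, 0.10))  # dense, healthy vegetation gives a high value
print(ndvi(0.20, 0.18))  # bare soil gives a value near zero
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so higher values indicate denser or healthier plant cover.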
Limitations of GIS
- Data quality depends on the accuracy, resolution, and currency of the source data
- GIS is a tool, not a theory: it does not explain why patterns exist, only where they are
- The "digital divide": access to GIS technology, training, and data is uneven, potentially excluding developing countries
- Spatial data can be politically sensitive (e.g., mapping disputed boundaries, sensitive infrastructure)
Evaluation of Research
Reliability
Reliability refers to the consistency and repeatability of data collection. To improve reliability:
- Standardise methods (same equipment, same procedure, same time of day)
- Train data collectors to ensure consistency
- Use pilot studies to identify and correct problems
- Collect sufficient sample sizes to reduce the influence of random variation
- Record methods in sufficient detail for replication
Validity
Validity refers to whether the data accurately measures what it is intended to measure.
- Internal validity: the extent to which the results truly reflect the phenomenon being studied, free from confounding variables
- External validity: the extent to which the findings can be generalised to other locations, populations, or times
To improve validity:
- Use appropriate sampling strategies to ensure representativeness
- Control for confounding variables (e.g., comparing sites with similar geology when studying slope processes)
- Triangulate (use multiple methods or data sources to cross-validate findings)
- Acknowledge and discuss limitations honestly
Limitations in Fieldwork
- Sampling bias: opportunity or convenience sampling may not represent the wider population or area
- Observer bias: the researcher's expectations may influence data collection or interpretation
- Temporal limitations: fieldwork is typically conducted over a short period and may not capture seasonal or long-term variation
- Equipment limitations: measurement instruments have finite precision and may introduce systematic error
- Access and safety: some locations may be inaccessible or dangerous, restricting data collection
- Ethical considerations: obtaining informed consent from human participants, respecting privacy, minimising environmental impact
Common Pitfalls
- Confusing correlation with causation in statistical analysis. A strong correlation between two variables does not mean one causes the other; a confounding variable may be responsible.
- Using an inappropriate statistical test for the data type or experimental design. The choice of test depends on whether the data are nominal, ordinal, or interval/ratio, and whether the design involves independent or related samples.
- Presenting data without contextual interpretation. Statistical results should be related back to the geographical theory and the research question.
- Failing to acknowledge limitations. Every study has limitations; acknowledging them strengthens the evaluation and demonstrates critical thinking.
- Using primary and secondary data interchangeably without discussing their different strengths, limitations, and potential inconsistencies.
- Confusing the mean, median, and mode, or reporting the mean when the data are skewed (the median is more appropriate for skewed distributions).
Practice Problems
Problem 1: Spearman's Rank Correlation
A student investigates the relationship between distance from the city centre ($\mathrm{km}$) and house prices (expressed as an index). The data are:
| Location | Distance ($\mathrm{km}$) | House Price Index |
|---|---|---|
| A | 1 | 95 |
| B | 3 | 80 |
| C | 5 | 65 |
| D | 8 | 55 |
| E | 12 | 40 |
| F | 15 | 30 |
Calculate Spearman's rank correlation coefficient.
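Ranking the data (both variables ranked in the same direction, with rank $1$ given to the lowest value):
| Location | Distance rank | Price rank | $d$ | $d^2$ |
|---|---|---|---|---|
| A | 1 | 6 | -5 | 25 |
| B | 2 | 5 | -3 | 9 |
| C | 3 | 4 | -1 | 1 |
| D | 4 | 3 | 1 | 1 |
| E | 5 | 2 | 3 | 9 |
| F | 6 | 1 | 5 | 25 |
$\sum d^2 = 70$, $n = 6$
$r_s = 1 - \frac{6 \times 70}{6(6^2 - 1)} = 1 - \frac{420}{210} = -1$: a perfect negative correlation. As distance from the city centre increases, house prices decrease consistently. At $p = 0.05$, the critical value for $n = 6$ is $0.886$. Since $|r_s| = 1 > 0.886$, the correlation is statistically significant.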
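The result can be checked with a short script that ranks both variables in the same direction (rank 1 for the smallest value) and applies $r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$. This is a minimal sketch that assumes no tied values; the function name is hypothetical.

```python
def spearman_rank(x_values, y_values):
    """Return (r_s, sum of d squared) for paired data with no tied values."""
    n = len(x_values)

    def ranks(values):
        # Rank 1 goes to the smallest value; valid only when there are no ties
        ordered = sorted(values)
        return [ordered.index(v) + 1 for v in values]

    x_ranks, y_ranks = ranks(x_values), ranks(y_values)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(x_ranks, y_ranks))
    r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))
    return r_s, sum_d_squared

# Data from Problem 1
distance = [1, 3, 5, 8, 12, 15]
price_index = [95, 80, 65, 55, 40, 30]
r_s, sum_d_squared = spearman_rank(distance, price_index)
print(sum_d_squared, r_s)  # 70 -1.0
```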
Problem 2: Chi-Squared Test
A geographer investigates whether land use varies between two areas. The observed frequencies are:
| Land Use | Area A | Area B | Row Total |
|---|---|---|---|
| Residential | 45 | 30 | 75 |
| Commercial | 20 | 35 | 55 |
| Industrial | 15 | 25 | 40 |
| Green Space | 20 | 10 | 30 |
| Column Total | 100 | 100 | 200 |
Calculate expected frequencies and the chi-squared statistic.
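Expected frequency: $E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$
| Land Use | $E$ (Area A) | $E$ (Area B) |
|---|---|---|
| Residential | $\frac{75 \times 100}{200} = 37.5$ | $37.5$ |
| Commercial | $\frac{55 \times 100}{200} = 27.5$ | $27.5$ |
| Industrial | $\frac{40 \times 100}{200} = 20$ | $20$ |
| Green Space | $\frac{30 \times 100}{200} = 15$ | $15$ |
$\chi^2 = \sum \frac{(O - E)^2}{E} = 3.00 + 4.09 + 2.50 + 3.33 = 12.92$ (contributions of the Residential, Commercial, Industrial, and Green Space rows respectively)
Degrees of freedom $= (4 - 1)(2 - 1) = 3$. Critical value at $p = 0.05$ for $df = 3$ is $7.815$. Since $12.92 > 7.815$, the result is statistically significant. There is a significant association between land use and area.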
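The arithmetic can be checked with a short script that derives each expected frequency from the row and column totals and sums $\frac{(O - E)^2}{E}$ over all cells. The function name is hypothetical; the observed counts are those given in the problem.

```python
def chi_squared(observed):
    """Return (chi-squared, expected frequencies, degrees of freedom).

    `observed` is a list of rows, each a list of counts.
    """
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    expected = []
    for i, row in enumerate(observed):
        expected_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * column_totals[j] / grand_total
            expected_row.append(e)
            statistic += (o - e) ** 2 / e
        expected.append(expected_row)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, expected, df

# Observed land use counts from Problem 2 (Area A, Area B)
land_use = [[45, 30], [20, 35], [15, 25], [20, 10]]
statistic, expected, df = chi_squared(land_use)
print(round(statistic, 2), df)  # 12.92 3
```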
Problem 3: Sampling Strategy Evaluation
A student wants to investigate whether soil moisture content decreases with distance from a river. They have time to take only a limited number of measurements. Evaluate two sampling strategies.
Strategy 1: Random sampling. The student randomly selects points at varying distances from the riverbank and measures soil moisture at each point.
Advantages: eliminates sampling bias; results are statistically valid and can be generalised. Disadvantages: may not capture the full gradient (some distances may be over- or under-represented by chance); practical difficulties in accessing randomly selected points.
Strategy 2: Systematic sampling (transect). The student lays a transect perpendicular to the river and takes measurements at regular intervals along it, starting at the riverbank.
Advantages: ensures even coverage of the distance gradient; efficient and practical; clearly shows the pattern of change with distance. Disadvantages: if the relationship is non-linear or if there are local anomalies (e.g., a spring, compacted path), the systematic interval may miss important features.
Recommendation: systematic transect sampling is more appropriate for this investigation because the research question specifically concerns the relationship between soil moisture and distance from the river. A systematic transect ensures that the full range of distances is sampled and provides a clear picture of the gradient. The data can be displayed as a line graph (soil moisture vs. distance) and analysed using Spearman's rank correlation.
Problem 4: GIS Application
Explain how a geographer could use GIS to investigate the impact of a new shopping centre on the surrounding area.
A GIS-based investigation could integrate multiple data layers:
Data collection and input:
- Digitise the location and boundaries of the new shopping centre
- Import data on pedestrian flows (before and after opening) from manual counts or automated sensors
- Import data on retail unit occupancy and types within the shopping centre
- Import Census data on household income, car ownership, and employment in surrounding areas
- Import data on existing retail centres (locations, sizes, types of shops)
Spatial analysis:
- Buffer analysis: create buffer zones at several fixed distances around the shopping centre to define zones of influence
- Overlay analysis: overlay the buffer zones with Census data to analyse the demographic characteristics of the shopping centre's catchment area
- Network analysis: use road network data to calculate drive-time isochrones (areas reachable within set travel times by car), which are more meaningful than straight-line buffers
- Thiessen polygons: create catchment areas based on the nearest shopping centre to identify which existing centres have lost trade
Change detection:
- Compare retail vacancy rates, footfall data, and transport patterns before and after the shopping centre's opening
- Map the spatial distribution of shop closures in the town centre to identify whether decline is concentrated in specific areas
Visualisation:
- Create maps showing catchment areas, demographic profiles, and changes in footfall
- Produce 3D visualisations of the shopping centre's visibility and accessibility
Problem 5: Fieldwork Evaluation
A student conducted a river study at three sites along a river. At each site, they measured width, depth, and velocity. Critically evaluate the reliability and validity of this study.
Reliability:
Strengths:
- Standardised methods (same equipment, same measurement procedure) at each site improve consistency
- Multiple measurements across the channel (systematic sampling) reduce the influence of local anomalies
Limitations:
- Only one visit to each site provides no assessment of temporal variation (seasonal changes, flood events). Repeated measurements over time would improve reliability.
- If different people measured at different sites, inter-observer variation could affect consistency
- Flow meters can be affected by debris or calibration drift; regular calibration is needed
Validity:
Strengths:
- The variables measured (width, depth, velocity) are directly relevant to the study of river processes (discharge, efficiency, hydraulic geometry)
- Multiple sites along the river's course allow the investigation of downstream changes, supporting geographical theory
Limitations:
- Only three sites may not capture the full complexity of downstream changes; additional sites would strengthen the analysis
- No consideration of confounding variables such as geology, land use, or tributary inputs, which could affect the results independently of distance downstream
- The study captures a snapshot in time; river characteristics vary with discharge, season, and weather. The findings may not be valid at different times of year
- No measurement of sediment load or channel roughness, which are important controls on river velocity and efficiency
Improvements:
- Increase the number of sites for a more complete picture of downstream change
- Conduct measurements at different times of year to assess seasonal variation
- Record confounding variables (geology, land use, tributaries) at each site
- Use a larger number of depth and velocity measurements across the channel
- Calibrate equipment before each data collection session