Network Science Institute | Northeastern University
NETS 7983 Computational Urban Science
2025-03-31
Computational Urban Science is primarily concerned with spatially embedded features of cities.
Some features are explicitly spatial: commuting and infrastructure networks, physical amenity visitation.
Some features are influenced by space: social, communication, and employment/opportunity networks.
The spatial structure of urban data requires special consideration.
Consider Tobler’s first “law” of Geography: “Everything is related to everything else, but near things are more related than distant things.”
In CUS, we want reliable, repeatable insights about urban systems. If everything is related to everything else:
How can we achieve reliable statistical estimates of the relationship between urban variables?
How can we measure causal relationships in spatially-interconnected systems?
In Week 2, we discussed the Modifiable Areal Unit Problem (MAUP) and how the choice of scale shapes analytical conclusions.
Today, we will use many definitions of proximity and adjacency as we aim to encode the spatial structure of our data into our analysis.
How do we define which things are “near” one another as described in Tobler’s law? Euclidean distance? Geodesic distance? Travel time? Semantic distance?
How do we define adjacency? \(k\) nearest-neighbors? What is \(k\)? What about physical boundaries between physically adjacent features?
Like MAUP, appropriate definitions of spatial structure in your data require your own scientific judgment.
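To see why the definition of "near" matters, here is a minimal sketch of \(k\)-nearest-neighbor adjacency with a pluggable distance function. The function names and the three toy points are hypothetical; the haversine formula assumes a spherical Earth, so it only approximates true geodesic distance.

```python
import math

def euclidean(p, q):
    """Straight-line distance, treating coordinates as planar units."""
    return math.dist(p, q)

def haversine_km(p, q, radius_km=6371.0088):
    """Great-circle distance in km between (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlam = math.radians(q[1] - p[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def knn_adjacency(points, k, distance):
    """Map each point index to the indices of its k nearest other points."""
    adjacency = {}
    for i, p in enumerate(points):
        ranked = sorted((distance(p, q), j) for j, q in enumerate(points) if j != i)
        adjacency[i] = [j for _, j in ranked[:k]]
    return adjacency

# Hypothetical (lat, lon) points at high latitude, where one degree of
# longitude covers far less ground than one degree of latitude.
points = [(60.0, 0.0), (60.0, 1.5), (61.0, 0.0)]

by_degrees = knn_adjacency(points, k=1, distance=euclidean)
by_ground = knn_adjacency(points, k=1, distance=haversine_km)
print(by_degrees[0], by_ground[0])
```

Under these assumptions, the nearest neighbor of the first point changes depending on whether distance is measured in raw degrees or along the Earth's surface. Note also that \(k\)-nearest-neighbor adjacency is not symmetric in general: \(j\) can be among \(i\)'s neighbors without \(i\) being among \(j\)'s.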
A tale of two cities: London’s rich and poor in Tower Hamlets
Is Tobler’s First Law a law? I prefer the term “empirical regularity”.
Spatial features have consistent, repeated patterns which should inform how you address statistical and causal inference and other analyses of spatial data.
Some of these regularities are:
Spatial autocorrelation (a.k.a. “clustering” or spatial heterogeneity)
Spatial nonstationarity (variation of statistical relationships across space)
Physical constraints on network structure
Tobler’s first law revisited: “…near things are more related than distant things.”
This is an empirical observation which holds true for a wide range of spatial phenomena.
Spatial autocorrelation permits:
Spatial autocorrelation hinders:
A funny example: Inverse-distance Weighting (IDW) (1965) beats Google Research’s (2024) elevation predictions:
General Geospatial Inference with a Population Dynamics Foundation Model
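For reference, a minimal sketch of inverse-distance weighting: the estimate at a target location is a weighted mean of the known values, with weights \(1/d^p\). The function name, the default \(p = 2\), and the sample points are assumptions for illustration; this is not the setup used in the comparison above.

```python
import math

def idw_predict(samples, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` from (x, y, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        d = math.dist((x, y), target)
        if d == 0.0:
            return value  # target coincides with a sample: return it directly
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total

# Hypothetical elevation samples: (x, y, elevation)
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]

midpoint = idw_predict(samples, (1.0, 0.0))      # equally far from both samples
near_first = idw_predict(samples, (0.5, 0.0))    # pulled toward the first sample
```

IDW relies directly on spatial autocorrelation: it only works because nearby observations tend to have similar values.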
Spatial variogram: how much do pairs of observations differ as a function of the distance between them?
Useful for assessing degree of spatial autocorrelation of continuous spatial variables.
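A minimal sketch of an empirical semivariogram, assuming the standard estimator \(\gamma(h) = \frac{1}{2N(h)} \sum (z_i - z_j)^2\) over the \(N(h)\) pairs whose separation falls in distance bin \(h\). The fixed-width binning scheme and the toy transect are assumptions for illustration.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, values, bin_width):
    """Return {bin_index: semivariance}; bin b covers distances [b*w, (b+1)*w)."""
    squared_diffs = defaultdict(float)
    pair_counts = defaultdict(int)
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            b = int(math.dist(points[i], points[j]) // bin_width)
            squared_diffs[b] += (values[i] - values[j]) ** 2
            pair_counts[b] += 1
    return {b: squared_diffs[b] / (2 * pair_counts[b]) for b in sorted(pair_counts)}

# Hypothetical transect with a smooth trend: values rise steadily with x
points = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 3.0, 4.0]

gamma = empirical_semivariogram(points, values, bin_width=1.0)
```

Semivariance that grows with distance, as in this toy example, is the signature of positive spatial autocorrelation: close pairs are more alike than distant pairs.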
Moran’s I
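A minimal sketch of global Moran's I, assuming the usual definition \(I = \frac{n}{S_0} \frac{\sum_i \sum_j w_{ij} z_i z_j}{\sum_i z_i^2}\), where \(z_i\) are deviations from the mean and \(S_0\) is the sum of all weights. The binary chain-adjacency weights and the two toy patterns are hypothetical.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given weights as {(i, j): w_ij}."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(weights.values())
    cross = sum(w * z[i] * z[j] for (i, j), w in weights.items())
    return (n / s0) * cross / sum(d * d for d in z)

def chain_weights(n):
    """Binary adjacency for n units in a row: each unit neighbors the next."""
    weights = {}
    for i in range(n - 1):
        weights[(i, i + 1)] = 1.0
        weights[(i + 1, i)] = 1.0
    return weights

w = chain_weights(4)
clustered = [1.0, 1.0, 0.0, 0.0]    # similar values sit side by side
alternating = [1.0, 0.0, 1.0, 0.0]  # every neighbor pair is dissimilar

i_clustered = morans_i(clustered, w)
i_alternating = morans_i(alternating, w)
```

Positive values indicate that similar values cluster together; negative values indicate a checkerboard-like pattern. Significance testing (for example, by permutation) is omitted from this sketch.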
Local Indicators of Spatial Association (LISA)
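A minimal sketch of one common LISA statistic, local Moran's \(I_i = \frac{z_i}{m_2} \sum_j w_{ij} z_j\) with \(m_2 = \frac{1}{n}\sum_i z_i^2\). It assumes row-standardized binary weights (each unit's neighbors share weight equally); the neighbor lists and values are hypothetical.

```python
def local_morans_i(values, neighbors):
    """Local Moran's I for each unit; `neighbors` maps index -> neighbor indices."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    local = []
    for i in range(n):
        # Spatial lag: average deviation among unit i's neighbors
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        local.append(z[i] * lag / m2)
    return local

# Hypothetical row of four units; the last one is a low value next to high ones
values = [5.0, 5.0, 5.0, 1.0]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

local = local_morans_i(values, neighbors)
```

Positive \(I_i\) marks a unit that resembles its neighbors (part of a cluster); negative \(I_i\) marks a unit that differs from its neighbors (a candidate spatial outlier). In practice each \(I_i\) is assessed against a permutation distribution, which this sketch omits.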
Another feature of spatial data: statistical relationships can vary across space
There are multiple techniques to address spatial autocorrelation and nonstationarity:
Geographically weighted regression:
Fixed effects models:
We used one last week!
Used to handle unobserved location-specific variation that impacts dependent variables. Only allows interpretation of within-unit effects.
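A minimal sketch of the within transformation behind a fixed effects model: subtract each unit's own mean from its observations, then fit a slope on what remains. The toy data and function names are hypothetical, and this is not necessarily the specification used in last week's practical.

```python
from collections import defaultdict

def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (with an intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def within_transform(unit, values):
    """Subtract each unit's own mean from its observations."""
    totals, counts = defaultdict(float), defaultdict(int)
    for u, v in zip(unit, values):
        totals[u] += v
        counts[u] += 1
    return [v - totals[u] / counts[u] for u, v in zip(unit, values)]

# Hypothetical data: two neighborhoods with very different baselines,
# but the same within-neighborhood slope of 2.
unit = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [12.0, 14.0, 16.0, -12.0, -10.0, -8.0]

pooled = ols_slope(x, y)
fixed_effects = ols_slope(within_transform(unit, x), within_transform(unit, y))
```

In this toy case the pooled slope is negative because the unobserved neighborhood baselines dominate, while the within estimate recovers the slope shared inside each neighborhood. Because unit means are removed, anything constant within a unit cannot be estimated.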
More on GWR and spatially-aware statistical inference next week!
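As a preview, a minimal sketch of the idea behind GWR: fit a separate weighted regression at each target location, down-weighting distant observations. It assumes a single predictor and a Gaussian kernel with a fixed bandwidth; the data and function name are hypothetical, and real GWR implementations also handle bandwidth selection and inference.

```python
import math

def gwr_local_fit(coords, x, y, target, bandwidth):
    """Weighted least-squares fit of y = a + b*x centered on `target`.

    Weights follow a Gaussian kernel of each observation's distance to target.
    Returns (intercept, slope).
    """
    w = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    total = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / total
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Hypothetical city: y rises with x in the west and falls with x in the east
coords = [(0, 0), (1, 0), (2, 0), (100, 0), (101, 0), (102, 0)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 2.0, 1.0]

_, slope_west = gwr_local_fit(coords, x, y, target=(1, 0), bandwidth=5.0)
_, slope_east = gwr_local_fit(coords, x, y, target=(101, 0), bandwidth=5.0)
_, slope_global = gwr_local_fit(coords, x, y, target=(50, 0), bandwidth=1e9)
```

With a very large bandwidth every observation gets nearly equal weight, so the fit approaches a single global regression. Here that global slope is close to zero even though the local slopes are close to +1 and -1, which is exactly the kind of nonstationarity a global model hides.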
Most geostatistical analysis happens within a constrained spatial boundary
For proximity- or adjacency-based statistical methods (like GWR):
Spatial clustering techniques account for spatial proximity when defining clusters.
Supports varying cluster density (producing varying size clusters).
Spatial clustering is useful for: dimensionality reduction of spatial features and for detecting spatial outliers.
In practical 5-1: note the difference between K-means clusters and geographically contiguous SKATER clusters (SKATER accounts for spatial proximity).
DBSCAN: Density-Based Spatial Clustering of Applications with Noise
Most common spatial clustering algorithms:
For \(\mathit{minPts} = 4\), with \(\varepsilon\) indicated by the circle radius. Red: core points; yellow: border points; blue: outlier (noise).
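The three point types can be sketched directly from their definitions. This is only the classification step, not the full clustering algorithm, and it assumes the common convention that a point counts itself among its \(\varepsilon\)-neighbors. The function name and toy points are hypothetical.

```python
import math

def classify_points(points, eps, min_pts):
    """Label each point 'core', 'border', or 'noise' using DBSCAN's definitions."""
    n = len(points)
    # eps-neighborhood of each point (includes the point itself)
    nbrs = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = {i for i in range(n) if len(nbrs[i]) >= min_pts}
    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in nbrs[i]):
            labels.append("border")  # not dense itself, but within eps of a core point
        else:
            labels.append("noise")
    return labels

# Hypothetical layout: a tight square, one point on its fringe, one far away
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2.2, 1), (8, 8)]
labels = classify_points(points, eps=1.5, min_pts=4)
print(labels)
```

Full DBSCAN then links core points that lie within \(\varepsilon\) of one another into clusters and attaches each border point to a neighboring core point's cluster.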
Spatial networks have unique characteristics driven by their spatial embeddedness.
Edge formation is driven by physical cost
Therefore, spatial networks typically have fewer long-range ties compared to non-spatial networks
Spatial networks are typically represented as weighted, directed networks
Spatial networks tend to have hierarchical structure
Hub structures and modularity in spatial networks result from the benefits of co-location and hierarchical organization.
Source: The Origins of Scaling in Cities [1].
CUS 2025, ©SUNLab group