Network Science Institute | Northeastern University
NETS 7983 Computational Urban Science
2025-02-07
In this lecture, we will present the main concepts of Computational Urban Science (CUS) and its applications to Urban Science. Here is the summary:
Large-scale datasets + computational methods/models + in urban contexts
to explore the dynamics of cities and urban challenges.
A branch of Computational Social Science and related to Urban Science
Integrates many disciplines, such as sociology, geography, economics, urban planning, computer science, mathematics, and many more.
Interested in the “why” rather than the “what” of urban phenomena.
Urban areas are the epicenter of human activity. Living in cities has many advantages, but also many challenges.
People
Resources
Social
Distribution of social interconnectivity, Productivity and Innovation in small vs large cities. From [1]
But cities are also home to the most critical problems in our society
Carbon Footprint and population of different cities. From [2]
Death rates by leading causes in urban vs rural areas. From [3]
Extracted from [4]
Cities are also interesting to study because these challenges are pervasive across nations, transcending cultural, economic and political boundaries.
For example, income inequality is rising in cities across the world, from the US to China to Brazil.
Extracted from [4]
Spatial (residential) segregation happens in all major cities.
This universality across time, geography and cultures makes cities a great laboratory to study human behavior and urban phenomena.
Solutions found in a particular city can be applied to other cities, making Computational Urban Science a powerful tool for policy making and urban planning.
By studying them we also gain a better understanding of most of our society.
Understanding urban challenges requires a good understanding of human behavior.
Traditional methods (surveys, interviews) to do that are:
More importantly, the behavior extracted from traditional methods is based on two key ideas:
Incredibly, those two ideas are the basis for most urban planning and policy making.
However, the explosion of data from smartphones, sensors, and online platforms has transformed our ability to study human behavior and its intertwined dependence with urban areas beyond our residence.
This, complemented with advances in machine learning, causal inference, computer vision, and data science, has opened new opportunities to study urban phenomena.
This data is:
But it also has many challenges.
Mobile phone data: Call Detail Records (CDRs) provide information on the location and communication patterns of millions of people. Location Based Services (LBS) from apps provide more detailed information on mobility patterns through GPS geolocation.
Social media data: Twitter, Facebook, Instagram, and other platforms provide information on social interactions, opinions, and activities, and can also geolocate those activities.
Transaction data: Credit card transactions, e-commerce, and other financial data provide information on consumption patterns and economic activity within urban areas.
Sensors: IoT devices, cameras, and other sensors provide information on environmental conditions, traffic, and other urban phenomena.
Satellite data: Remote sensing data provides information on urban use, vegetation, and other environmental factors.
Open data: Data from government agencies, companies, and other organizations provide information on urban infrastructure, services, and other aspects of urban life.
Sources of large-scale data have different temporal and spatial resolutions, and different levels of detail and coverage
Feature | Census | Social Media | Bank | CDR | GPS | Sensors & Cameras |
---|---|---|---|---|---|---|
High Accuracy | ✔️ | ✔️ | ✔️ | ✔️ | ||
Availability | ✔️ | ✔️[?] | ️ | |||
Pop. Coverage | ✔️ | ✔️ | ✔️ | |||
Real-time | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | |
Cost | ✔️ | ✔️ | ✔️ | ✔️ | ||
Privacy | ✔️ |
Sources of geolocalized activity from mobile phones
Example: Location Based Services (LBS) data.
Transactional data (credit card transactions, transport cards) is another important source of data for studying human behavior in urban areas.
It provides information on consumption patterns, mobility patterns, and other economic activities.
Example: Urban lifestyles extracted from credit card transactions. From [5]
Urban consumption patterns before, during, and after COVID-19 lockdowns in Mexico. From Moro et al
Example: using geographical mobility, activity, and content analysis from Twitter to understand patterns of unemployment in Spain [6]
Geographical Mobility in Spain using tweets. From [6]
Advances in satellite technology and aerial and street imaging have provided new opportunities to study urban areas.
For example, Low-Earth orbit (LEO) constellations capture detailed (up to 1-meter resolution) and frequent (daily/hourly intervals) imagery of urban environments.
Most high-resolution satellite imagery is commercial, but some of it is available for scientific use, like
the European Union’s Sentinel Browser
Example: Wealth estimation from satellite imagery in African cities. From [7]
Wealth estimation from satellite imagery in Nigerian cities. From [7]
Example: change in industrial and economic areas, using night-time lights to estimate GDP. Official GDP vs estimated.
Change in industrial and economic areas, using night-time lights to estimate GDP. From The Economist
Advances in computer vision and machine learning have allowed the automatic extraction of information from street imagery. That allows transforming street imagery or video into structured data about urban areas and people’s behavior.
AI Visual recognition of a street image
For example, large-scale datasets from Google Street View have been used to estimate hidden neighborhood characteristics like socioeconomic status, crime rates, or even political preferences [8] [9]
Finally, the rise of sensors and the “smart city” revolution has provided new sources of continuous data streams about city operations and resident activities. Embedded in infrastructure like traffic lights, public transportation systems, waste management facilities, and utility grids, these sensors provide real-time information about urban phenomena.
Apart from the Census, there are many open datasets that can be used to study urban phenomena. In particular, many city authorities, and government agencies provide open data about urban infrastructure, services, and other aspects of urban life.
For example, the Open Data Portal of New York City or the Analyze Boston Portal provide data on a wide range of topics, from crime rates to building permits to restaurant inspections.
Canopy Change Assessment Project in the Boston Open Data Portal
At the same time that large-scale datasets became available, advances in AI and machine learning revolutionized the way we analyze and extract information from those datasets.
Traditional methods fall short of handling the scale, variety, and dynamics of large-scale datasets. We need advanced methods to extract information from them.
For example
Spatial or geographical information is “different”. Its structure differs from that of tabular data or imagery datasets. In general, spatial data is characterized by:
Thus, algorithms need to be adapted to handle that data structure, for example through spatial regressions, spatial clustering, spatial machine learning, spatial causal inference, or spatial networks. Failure to use those methods can lead to biased results.
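One reason standard methods can mislead on spatial data is that nearby observations tend to be correlated, which breaks the independence assumption behind many tabular techniques. The sketch below measures that correlation with Moran's I, a common spatial autocorrelation statistic. The six-area layout, the neighbor weights, and the values are hypothetical and chosen only for illustration.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I for values observed in n areas.

    weights[i, j] > 0 means areas i and j are neighbors.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()  # deviations from the mean
    n = len(x)
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

# Hypothetical layout: 6 areas along a line, each adjacent to the next one.
n = 6
w = np.zeros((n, n))
for i in range(n - 1):
    w[i, i + 1] = w[i + 1, i] = 1.0

clustered = [1, 1, 1, 5, 5, 5]     # similar values sit next to each other
alternating = [1, 5, 1, 5, 1, 5]   # neighbors are always dissimilar

print(morans_i(clustered, w))      # positive: spatial clustering
print(morans_i(alternating, w))    # negative: spatial dispersion
```

A value close to zero would suggest no spatial structure. Clearly positive values, as in the clustered case, are a signal that spatially aware methods such as spatial regression are worth considering.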
Spatial Network of friendships between cities
Why has large-scale data revolutionized Urban Science?
Average distance travelled by residents in NYC before and after social distancing measures. From mobile phone mobility data
Change in visits to Fast Food Outlets after changing workplaces and their food environments
However, large-scale data also has many challenges:
Important
How can we address these problems?
Urban Science is the study of cities and urban areas, focusing on their structure, dynamics, and challenges.
Integrates disciplines like sociology, geography, economics, urban planning, and computer science.
Computational Urban Science can be viewed as an extension of Urban Science, studying old problems with:
But the use of large-scale data and computational methods has also made it possible to study new problems that could not be studied before, and has transformed the way we study old problems.
Let’s see some examples:
One crucial characteristic of Computational Urban Science is the different dimensions and nature of the problem under study.
Here is a list of typical problems under study in Urban Science
Note
Most of these problems involve the study of the relationship between human behavior, urban environment, and societal outcomes.
But they differ in the level of predictability of human behavior.
Hard problems
For example, transportation or environmental sustainability problems can be well-framed using computational optimization tools, atmospheric models, and simulation models because they are based on physical laws and highly predictable human behavior.
In transportation, for example, people try to optimize their routes and thus we can predict the traffic congestion in a city using the number of cars, the road network, and the traffic lights.
Hard problems in urban areas are highly predictable and can be solved using computational models.
Example: traffic simulation in Berlin using agent-based simulations with the MATSim open source software.
Soft problems
But other problems like urban planning, housing, inequality, or governance are based on less predictable and more complex human behavior subject to highly variable social, economic, and political factors.
Even the same problem can have both hard and soft components. For example, predicting what individuals are going to do at lunch time is very easy (hard problem), but the place they choose might be more difficult to predict.
Large-scale data is also changing the nature of some problems. For example, decades ago, it was very hard to predict store patronage, the segregation of people in their daily moves, the risk of getting infected by attending an event, or the impact of transportation policies on business.
Computational Urban Science is transforming these problems from descriptive observations to predictive and prescriptive models.
Detecting income segregation at individual places in urban areas [20]
Another characteristic of Computational Urban Science is the study of cities as complex systems, challenging the prevailing urban science paradigm that the “City is a Tree” [21]
Traditional (solid) versus current (solid + dashed) view of urban challenges
This view is being challenged by the “City is a Network” paradigm (solid + dashed lines), especially through the work of Jane Jacobs [22], Christopher Alexander [21] and, more recently, Mike Batty [23].
For example, here is a manually annotated causal map of how environmental, health, social, and economic systems were intertwined during the COVID-19 pandemic
A preliminary causal diagram demonstrating the complexity of the COVID-19 pandemic environmental–health–socio–economic system. From
More granular data about people’s behavior has permitted establishing a set of scientific rules or laws that define the core principles of human behavior in cities.
Before Computational Urban Science, some of these principles were known but not quantified or generalized. For example, here are the “Three Laws of Geography” [24] [25]
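or the “Law of Gravity” [26] for the aggregated flow of movements \(T_{ij}\) between places \(i\) and \(j\):
\[ T_{ij} = \frac{P_i O_j}{d_{ij}^\beta} \]
where \(P_i\) is the population of place \(i\), \(O_j\) is the attractiveness of area \(j\), \(d_{ij}\) is the distance between them, and \(\beta\) is an exponent that controls how quickly flows decay with distance.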
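As a minimal sketch of how the gravity formula can be evaluated, the snippet below computes \(T_{ij}\) for every pair of places at once. The three places, their populations, attractiveness values, distances, and the choice \(\beta = 2\) are hypothetical and serve only to illustrate the arithmetic.

```python
import numpy as np

def gravity_flows(population, attractiveness, distance, beta=2.0):
    """Return the matrix T with T[i, j] = P_i * O_j / d_ij**beta.

    Flows from a place to itself (the diagonal) are set to zero.
    """
    P = np.asarray(population, dtype=float)
    O = np.asarray(attractiveness, dtype=float)
    d = np.asarray(distance, dtype=float)
    T = np.zeros_like(d)
    off_diag = ~np.eye(len(P), dtype=bool)  # skip i == j, where d_ij = 0
    T[off_diag] = (P[:, None] * O[None, :])[off_diag] / d[off_diag] ** beta
    return T

# Hypothetical toy inputs: three places, symmetric distances in km.
population = [1000.0, 500.0, 200.0]
attractiveness = [50.0, 80.0, 10.0]
distance = np.array([[0.0, 10.0, 20.0],
                     [10.0, 0.0, 5.0],
                     [20.0, 5.0, 0.0]])

flows = gravity_flows(population, attractiveness, distance, beta=2.0)
print(flows)
```

Even with symmetric distances the flows are not symmetric: the flow from place 0 to place 1 is 800, while the reverse flow is 250, because origin population and destination attractiveness play different roles in the formula.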
Computational Urban Science has made it possible to quantify and generalize these laws, and even to find new ones. For example, here are some of the most recent ones:
Universal distance-frequency distribution of population flows, from [28]
Scaling of urban infrastructure and socio-economic output, from [29]
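Scaling relationships of this kind are commonly written in the urban scaling literature as a power law \(Y = Y_0 N^\beta\), where \(N\) is city population and \(Y\) is an urban quantity. Assuming that form, the sketch below estimates \(\beta\) with a straight-line fit in log-log space. The data is synthetic, and the exponent (1.15), prefactor, and noise level are hypothetical values, not results from [29].

```python
import numpy as np

# Synthetic cities (assumed values, for illustration only).
rng = np.random.default_rng(0)
true_beta, true_y0 = 1.15, 2.0
population = np.exp(rng.uniform(np.log(1e4), np.log(1e7), size=200))
noise = np.exp(rng.normal(0.0, 0.1, size=200))  # multiplicative noise
output = true_y0 * population**true_beta * noise

def fit_scaling(population, output):
    """Fit log(Y) = log(Y0) + beta * log(N) by least squares."""
    slope, intercept = np.polyfit(np.log(population), np.log(output), deg=1)
    return slope, np.exp(intercept)

beta_hat, y0_hat = fit_scaling(population, output)
print(f"estimated beta = {beta_hat:.3f}, estimated Y0 = {y0_hat:.3f}")
```

An estimated exponent above 1 indicates superlinear scaling (the quantity grows faster than population), while an exponent below 1 indicates sublinear scaling.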
Computational Urban Science has also allowed us to understand the emergence of those laws by finding the microscopic rules that generate them.
For example, the radiation model can be derived from the concept of “intervening opportunities” and the “universal visitation law” from the concept of “preferential attachment” in the Exploration and Preferential Return (EPR) mobility models.
This “emergence” of macroscopic universal laws from a very simple microscopic description of human behavior might explain why those laws are universal across different geographical, cultural, and societal contexts.
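To make the idea of a simple microscopic rule concrete, here is a minimal sketch of an exploration and preferential return process for one individual. At each step the person either explores a new place, with a probability that shrinks as the number of known places grows, or returns to a known place chosen in proportion to past visits. The function name and the parameter values `rho` and `gamma` are illustrative assumptions, and the sketch ignores geography (no distances or jump lengths).

```python
import random
from collections import Counter

def simulate_epr(steps, rho=0.6, gamma=0.21, seed=0):
    """Simulate visit counts for one individual under a simplified EPR rule."""
    rng = random.Random(seed)
    visits = Counter({0: 1})  # the individual starts at location 0
    for _ in range(steps):
        n_distinct = len(visits)
        p_explore = rho * n_distinct ** (-gamma)
        if rng.random() < p_explore:
            # Exploration: visit a location never seen before.
            visits[n_distinct] += 1
        else:
            # Preferential return: pick a known location with probability
            # proportional to the number of previous visits to it.
            places = list(visits.keys())
            counts = list(visits.values())
            chosen = rng.choices(places, weights=counts, k=1)[0]
            visits[chosen] += 1
    return visits

visits = simulate_epr(steps=2000)
ranked = sorted(visits.values(), reverse=True)
print("distinct locations:", len(visits))
print("visits to the five most visited locations:", ranked[:5])
```

Plotting `ranked` against rank on log-log axes is one way to check whether a few locations dominate the visit counts, which is the kind of macroscopic regularity the text refers to.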
This is also at the core of Computational Urban Science: how local interactions between individuals, environments, and societal outcomes create universal laws to understand, predict, and address urban challenges globally.
In summary, we can define Computational Urban Science as:
More reading:
CUS 2025, ©SUNLab group socialurban.net/CUS
Social media data to understand human behavior in cities
Social media data is/was another way to understand human behavior in urban areas.
Apart from geolocalization, it also provides information about social interactions, opinions, and activities.
However, since 2015 and for privacy reasons, many social media platforms started to deprecate the use of geolocation of activities in their APIs.
Still some social media platforms provide some geographical information at aggregated (zip code) level.