Human mobility data has opened a new frontier for understanding cities, economies, and public health. But these datasets also come with critical methodological challenges: sampling bias, lack of standardization, and privacy risks. Our research develops the methodological foundations needed to make mobility science reliable, reproducible, and socially responsible.
First, we study and correct systematic biases in mobility data. Location datasets are often collected through mobile apps or commercial platforms, which means they can over-represent certain demographic groups or urban areas. By measuring these biases and developing correction methods, we improve the validity and demographic representativeness of mobility-based research [1].
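As a rough illustration of what a bias correction can look like, the sketch below applies simple post-stratification: each sampled user is reweighted so that the weighted group shares match known population shares. This is a generic textbook technique shown under stated assumptions, not the method of [1]. The function name, group labels, population shares, and trip counts are all hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Return one weight per sampled user so that the weighted share of each
    group matches its share in the target population.

    sample_groups: list of group labels, one per user (hypothetical labels).
    population_shares: dict mapping each group label to its population share.
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    weights = []
    for group in sample_groups:
        sample_share = counts[group] / n
        # Over-represented groups get weights below 1, under-represented above 1.
        weights.append(population_shares[group] / sample_share)
    return weights


# Hypothetical sample: high-income users are over-represented (60% vs. 40%).
sample_groups = ["high_income"] * 6 + ["low_income"] * 4
population_shares = {"high_income": 0.4, "low_income": 0.6}

# Hypothetical mobility metric: average trips per day for each user.
trips_per_day = [4.0] * 6 + [2.0] * 4

weights = poststratification_weights(sample_groups, population_shares)

raw_mean = sum(trips_per_day) / len(trips_per_day)
weighted_mean = sum(w * t for w, t in zip(weights, trips_per_day)) / sum(weights)

print(f"raw mean: {raw_mean:.2f}, weighted mean: {weighted_mean:.2f}")
```

In this toy example the unweighted mean overstates trips per day because the more mobile group is over-sampled; the weighted mean pulls the estimate toward the population mix. Real corrections typically need to handle more dimensions (for example, time of day or geography) and groups that are missing from the sample entirely.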
Second, we work to create open and standardized mobility datasets that enable reproducible science and fair comparison between models. Current mobility studies often rely on proprietary pipelines and inconsistent preprocessing steps, making it difficult to replicate results. Our work proposes benchmark datasets and standardized data production pipelines so that researchers can evaluate models under comparable conditions [2].
Third, we develop privacy-preserving synthetic mobility datasets. Real location traces contain sensitive information about individuals and cannot easily be shared. Using machine learning models such as recurrent neural networks, we generate realistic synthetic mobility trajectories that preserve statistical properties of real data while protecting individual privacy [3].
Together, these efforts strengthen the scientific foundations of urban data science. By improving data quality, ensuring demographic fairness, and protecting privacy, our work enables researchers and policymakers to study human behavior in cities with greater accuracy and responsibility.
References
[1] Sanchez, S. A., Gibbs, H., Yabe, T., O’Brien, D. T., & Moro, E. (2026). Correcting temporal bias in mobility data using time-use surveys. arXiv.
[2] Yabe, T., Luca, M., Tsubouchi, K., Lepri, B., Gonzalez, M. C., & Moro, E. (2024). Enhancing human mobility research with open and standardized datasets. Nature Computational Science, 4, 469–472.
[3] Berke, A., Doorley, R., Larson, K., & Moro, E. (2022). Generating synthetic mobility data for a realistic population with RNNs to improve utility and privacy. arXiv.