Simulating human mobility considering the social dimension, too
In the last decade, the availability of large mobility datasets such as Call Detail Records (CDR) [1, 2, 3], traces from GPS devices embedded in smartphones and cars [3], and geo-tagged posts on Location-Based Social Networks (LBSN) [4] has made it possible to characterize human mobility from a statistical and mathematical point of view, uncovering the hidden rules that govern individuals' displacements.
Many works in the literature exploit such mobility traces to quantify the patterns that characterize the movements of individuals. These studies show the heterogeneous nature of travel patterns and the existence of a power-law distribution both in jump lengths, i.e., the distances between two consecutive locations visited by an individual [1, 5], and in the characteristic spatial spread of an individual, referred to as the radius of gyration [1].
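As a rough illustration of the two quantities just mentioned, the following sketch computes jump lengths and the radius of gyration for a single trajectory given as a list of (latitude, longitude) points. The function names are hypothetical, and the center of mass is approximated by the plain mean of the coordinates, which is an assumption that is reasonable only at city scale.

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def jump_lengths(points):
    """Distances between consecutive points of one individual's trajectory."""
    return [haversine_km(a, b) for a, b in zip(points, points[1:])]

def radius_of_gyration(points):
    """Root mean square distance of the points from their center of mass."""
    n = len(points)
    # Assumption: the center of mass is the plain mean of the coordinates.
    center = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
    return math.sqrt(sum(haversine_km(p, center) ** 2 for p in points) / n)

# Hypothetical trajectory: three points in the New York City area.
trajectory = [(40.7128, -74.0060), (40.7580, -73.9855), (40.6782, -73.9442)]
print(jump_lengths(trajectory))
print(radius_of_gyration(trajectory))
```

Collecting these values over many individuals gives the empirical distributions whose heavy tails the cited studies describe.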
Humans exhibit a strong tendency to return to locations they visited before [1] and a propensity to be stationary during the night hours while they move preferably at specific times of the day, following a circadian rhythm [2]. The time spent by an individual in a location is not uniformly distributed but follows a power-law distribution [2]. Moreover, sociality shapes the displacements of individuals. About 10-30% of human movements can be explained by social purposes [6].
Mobility datasets are of fundamental importance in a wide range of areas, from tackling the spreading of epidemics [7] to urban planning, traffic forecasting, what-if analysis, location recommendation systems, car sharing, and geo-marketing.
Unfortunately, besides being useful in many important disciplines, mobility data poses risks to the privacy of the users whose movements are described by the trajectories in the datasets. An attacker who knows only a small number of points of a user's trajectory could re-identify the individual or discover sensitive information such as the workplace or the home location, putting the user's identity at serious risk (Figure 1). Regulations such as the EU's GDPR enforce the same protection measures for location data as for personal data. Simply removing personal data from movement trajectories and replacing them with random identifiers does not anonymize the data, because the mobility of individuals is highly unique and it is relatively easy to match trajectories back to personal data. For this reason, mobility datasets cannot be freely shared and must be anonymized to prevent the re-identification of individuals, which makes mobility data expensive and rarely publicly available for research.
Figure 1: An attacker that knows a part of a user’s trajectory can infer sensitive information that may undermine the individual’s privacy.
One way to overcome this privacy issue is to use the above descriptions of human mobility patterns to create generative models for human mobility, i.e., algorithms that produce synthetic yet realistic trajectories while protecting the users' privacy. Synthetic trajectories are useful in what-if analysis or for training a machine learning model without the need for a real dataset. Beyond protecting privacy, a significant advantage of generative models is the cost and time of data collection, which are negligible compared with acquiring a real dataset (within a few minutes, the model can generate thousands of trajectories). Moreover, generative algorithms allow simulating the mobility of a set of agents in an unseen scenario.
Most generative models focus only on the spatial and temporal dimensions of human mobility.
In this post, we present STS-EPR (Spatial, Temporal, and Social Exploration and Preferential Return), a generative model that includes mechanisms to capture the spatial, temporal, and social aspects of human mobility. STS-EPR couples two state-of-the-art models: GeoSim [8] and Ditras [3].
Our model belongs to the family of mechanistic generative models for human mobility, i.e., it uses the statistical principles and the known mechanisms that govern human mobility to generate synthetic trajectories. STS-EPR includes a spatial mechanism that considers the distance between locations as well as the relevance of a location, a temporal mechanism able to capture the tendency of individuals to follow a circadian rhythm [3], and a social mechanism that models the influence of social links on human displacements.
In STS-EPR, a synthetic agent can move through a two-dimensional space partitioned into cells by exploiting two independent mechanisms.
The first is a spatial mechanism in which the synthetic individual determines in a probabilistic way whether to explore a position never visited before or to return to a previously visited location. The second is a social mechanism in which the agent chooses in a probabilistic way whether to select the destination of its next move individually or with the influence of one of its social contacts.
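The post does not state the probabilities behind these two choices. As a minimal sketch, the code below assumes the form used in the exploration and preferential return model of [2], where the probability of exploring decays with the number S of distinct locations already visited as rho * S^(-gamma), and it assumes a fixed probability alpha of acting under social influence. The function names and the default parameter values are illustrative assumptions, not the settings of STS-EPR.

```python
import random

def exploration_probability(n_visited, rho=0.6, gamma=0.21):
    """Assumed form rho * S^(-gamma); an agent with no history must explore."""
    if n_visited <= 0:
        return 1.0
    return rho * n_visited ** (-gamma)

def choose_action(n_visited, rho=0.6, gamma=0.21, alpha=0.2, rng=random):
    """Draw the spatial and the social choice independently of each other."""
    p_explore = exploration_probability(n_visited, rho, gamma)
    spatial = "explore" if rng.random() < p_explore else "return"
    # alpha is an assumed, fixed probability of following a social contact.
    social = "social" if rng.random() < alpha else "individual"
    return spatial, social

# The more distinct locations an agent knows, the less likely it is to explore.
for s in (1, 10, 100):
    print(s, round(exploration_probability(s), 3))
print(choose_action(10, rng=random.Random(42)))
```

Because the two draws are independent, each move falls into one of four cases: individual exploration, individual return, social exploration, or social return.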
If the agent decides to explore a new location without the influence of its social contacts, it selects the place to move toward using a gravity law, which encourages movements towards popular places and penalizes displacements between distant places. In the case of a return to a previously visited location without the influence of a social contact, the agent selects the location with a probability proportional to the number of visits it made to that location. If, instead, the agent acts under social influence, it chooses the location to explore for the first time or to return to (depending on the spatial mechanism picked) with a probability proportional to the number of visits made by the social contact.
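The selection rules described above can be sketched as weight functions followed by a weighted random draw. This is an illustration under stated assumptions, not the scikit-mobility implementation: all function names are hypothetical, the gravity weight is assumed to be the product of the two relevances divided by the squared distance, and the candidate sets for the social case (unvisited locations for exploration, already visited ones for return) are an interpretation of the text.

```python
import math
import random

def weighted_choice(weights, rng=random):
    """Return a key drawn with probability proportional to its weight, or None."""
    if not weights:
        return None
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys], k=1)[0]

def gravity_weights(origin, relevance, distance, visited):
    """Individual exploration: favor relevant cells, penalize distant ones."""
    return {
        cell: relevance[origin] * relevance[cell] / distance(origin, cell) ** 2
        for cell in relevance
        if cell != origin and cell not in visited
    }

def return_weights(own_visits, current):
    """Individual return: weight visited cells by the agent's own visit counts."""
    return {c: n for c, n in own_visits.items() if c != current and n > 0}

def social_weights(own_visits, contact_visits, current, explore):
    """Social choice: weight candidate cells by the contact's visit counts."""
    return {
        c: n for c, n in contact_visits.items()
        if c != current and n > 0
        and ((c not in own_visits) if explore else (c in own_visits))
    }

# Hypothetical toy tessellation with three cells on a plane.
cells = {"A": (0, 0), "B": (3, 4), "C": (6, 8)}
relevance = {"A": 10, "B": 5, "C": 20}

def dist(a, b):
    return math.dist(cells[a], cells[b])

weights = gravity_weights("A", relevance, dist, visited={"A"})
print(weights, weighted_choice(weights, rng=random.Random(0)))
```

In the toy example the more relevant cell C is also twice as far from A as B, so under the assumed squared-distance penalty the two cells end up with the same weight.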
At this point, a question naturally arises: how can we establish that a synthetic trajectory is realistic?
The realism of a set of synthetic trajectories is assessed by measuring how closely the distributions of some well-known mobility patterns computed on them match those computed on real trajectories, which serve as a benchmark. To evaluate the capability of STS-EPR to generate realistic trajectories, we compared the distributions of the mobility patterns of the synthetic trajectories generated in three cities (New York City, London, and Tokyo) with those of the real ones (Figure 2). Furthermore, the trajectories generated by STS-EPR were also compared with those generated by GeoSim and Ditras.
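The post does not say which similarity measure was used. As one possible illustration, the sketch below compares two samples of a mobility measure (for example, jump lengths) with the two-sample Kolmogorov-Smirnov statistic, i.e., the largest gap between their empirical cumulative distributions. The choice of this statistic, the function name, and the toy values are assumptions made for the example.

```python
from bisect import bisect_right

def ks_statistic(real, synthetic):
    """Largest absolute difference between the two empirical CDFs."""
    a, b = sorted(real), sorted(synthetic)
    # The maximum gap is attained at one of the observed values.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )

# Hypothetical jump lengths in km for real and synthetic trajectories.
real_jumps = [0.5, 1.2, 1.9, 3.4, 8.0, 15.5]
synthetic_jumps = [0.4, 1.0, 2.2, 3.0, 9.1, 14.0]
print(ks_statistic(real_jumps, synthetic_jumps))
```

A value close to 0 means the two distributions are nearly indistinguishable, while a value close to 1 means they barely overlap.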
Figure 2: A spatial representation of the trajectory of a real individual (left) and of a synthetic individual generated by STS-EPR (right). Figures generated with scikit-mobility.
STS-EPR is part of the scikit-mobility Python library, and with a few lines of code, it is possible to generate a set of synthetic mobility trajectories.
The Python code required to start the simulation.
For an exhaustive example of instantiation and generation of synthetic mobility trajectories using the STS-EPR model, please visit https://jovian.ai/giuliano-cornacchia/example-sts-epr.
In conclusion, mobility data are fundamental in several disciplines and scientific areas.
Unfortunately, these datasets cannot be made public for research purposes due to privacy concerns. One way to solve these privacy problems and to be able to use mobility trajectories is to create a generative model that allows the creation of synthetic and realistic trajectories while protecting the privacy of users. We presented our model, STS-EPR, which uses the social, spatial, and temporal dimensions of mobility to generate trajectories.
In addition, a quick and easy way to use the model in a Python environment has been provided.
The preprint of the STS-EPR model article, to appear in Procedia Computer Science 184C (2021) pp. 258-265, can be downloaded at https://data.d4science.net/jNrH
Author:
Giuliano Cornacchia, Ph.D. student in Computer Science at the University of Pisa.
Exploratory:
Sustainable Cities for Citizens
References:
[1] González, M., Hidalgo, C. & Barabási, AL. Understanding individual human mobility patterns. Nature 453, 779–782 (2008). https://doi.org/10.1038/nature06958
[2] Song, C., Koren, T., Wang, P. et al. Modelling the scaling properties of human mobility. Nature Phys 6, 818–823 (2010). https://doi.org/10.1038/nphys1760
[3] Pappalardo, L., Simini, F. Data-driven generation of spatio-temporal routines in human mobility. Data Min Knowl Disc 32, 787–829 (2018). https://doi.org/10.1007/s10618-017-0548-4
[4] Dingqi Yang, Bingqing Qu, Jie Yang, and Philippe Cudre-Mauroux. 2019. Revisiting User Mobility and Social Relationships in LBSNs: A Hypergraph Embedding Approach. In The World Wide Web Conference (WWW '19). Association for Computing Machinery, New York, NY, USA, 2147–2157. DOI:https://doi.org/10.1145/3308558.3313635
[5] Brockmann, D., Hufnagel, L. & Geisel, T. The scaling laws of human travel. Nature 439, 462–465 (2006). https://doi.org/10.1038/nature04292
[6] E. Cho, S. Myers, and J. Leskovec, “Friendship and mobility: User movement in location-based social networks,” pp. 1082–1090, 08 2011.
[7] Nuria Oliver, Bruno Lepri, Harald Sterly, Renaud Lambiotte, Sébastien Deletaille, Marco De Nadai, Emmanuel Letouzé, Albert Ali Salah, Richard Benjamins, Ciro Cattuto, et al. 2020. Mobile phone data for informing public health actions across the COVID-19 pandemic life cycle.
[8] J. Toole, C. Herrera-Yague, C. Schneider, and M. C. Gonzalez, “Coupling human mobility and social ties,” Journal of the Royal Society, Interface /th