Establishing phone-pair co-usage by comparing mobility patterns
Introduction
As mobile phones have become established in everyday communication, so has their use in criminal activity. Criminals often use cheap, prepaid “burner phones” to communicate about criminal activities. These burner phones are used for some time before being discarded. Users of burner phones often carry multiple phones at the same time, including a legitimate, “personal” mobile phone to contact family and friends.
Identifying the user of a burner phone can be of great value in tying criminal activities to suspects. One type of trace that may reveal the user’s identity is the location trace derived from call detail records (CDRs): the location data from the burner phone can be compared to location data from the personal phone of a suspect. CDRs are stored by telecom providers and can typically be requested by police for a period of up to several months in the past, even if the phone itself was never physically recovered.
Traditionally, forensic analysis of location data centers on key events and focuses on a few individual records from CDR files. However, CDRs suffer from low accuracy when it comes to pinpointing the position of a cell phone user. While individual records may show whether or not a cell phone is within a certain area, they are less suitable for determining whether the phone is with the same user as another phone. On the other hand, when CDRs are available for an extended period of time, there is no need to restrict this comparison to one or several records. A series of CDRs of two phones can provide statistical evidence that the two phones were carried by the same user.
In this paper, we propose a method of comparing mobility patterns from CDRs of two phones, using logistic regression. The outcome of this process is a similarity score which indicates whether the phones traveled with the same or with a different user. To make the result usable in court, we use this similarity score to calculate evidential strength in terms of a likelihood ratio. The likelihood ratio expresses how probable the similarity score is in the scenario that both phones belong to the same user, relative to the scenario that the phones belong to different users. We performed field experiments to obtain the reference set needed for this approach. Furthermore, we tested the method’s performance on data from real phone usage, which was not used to train the model. Finally, we assessed the robustness of this method by evaluating performance under a range of different scenarios and method perturbations, including a comparison to a simpler approach based on the notion of dislocations.
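As a minimal illustration of the last step, the sketch below turns a similarity score into a likelihood ratio by dividing the score density under the same-user hypothesis by the density under the different-user hypothesis. It assumes, purely for illustration, that the reference scores under each hypothesis are normally distributed; the calibration actually used in the method may differ. The function names and the reference scores are hypothetical.

```python
import math
from statistics import mean, stdev


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def likelihood_ratio(score, same_user_scores, diff_user_scores):
    """LR = p(score | same user) / p(score | different users).

    Assumption (not from the paper): each reference score set is
    modelled by a single Gaussian fitted with its sample mean and std.
    """
    p_same = gaussian_pdf(score, mean(same_user_scores), stdev(same_user_scores))
    p_diff = gaussian_pdf(score, mean(diff_user_scores), stdev(diff_user_scores))
    return p_same / p_diff


# Hypothetical reference scores, standing in for scores obtained
# from a labelled reference set of phone pairs.
same_ref = [0.82, 0.90, 0.75, 0.88, 0.95, 0.79]
diff_ref = [0.10, 0.25, 0.18, 0.05, 0.30, 0.22]

lr_high = likelihood_ratio(0.85, same_ref, diff_ref)  # score typical of same-user pairs
lr_low = likelihood_ratio(0.15, same_ref, diff_ref)   # score typical of different-user pairs
print(lr_high > 1.0, lr_low < 1.0)
```

A likelihood ratio above 1 supports the same-user scenario and a value below 1 supports the different-user scenario; the further from 1, the stronger the support.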
Background and related work
When a mobile phone communicates, be it for calling, texting or data traffic, it connects to a cell tower. The timestamp of this event and the identification number (cell id) of the tower are recorded by the telecom provider in call detail records. The location of this cell tower gives an approximation of the location of the mobile phone user, because a mobile phone will preferentially connect to the cell tower with which it has the best connection in terms of signal strength. This means
Materials and methods
We will assume that two mobile phones that have the same user will be in the same place at all times. The call detail records we use to determine the location of a mobile phone are only generated at certain events, such as the initiation of a phone call. This means we will have a limited number of observations for each phone during a period of comparison, where an observation consists of a time t and the location of the cell tower x. The data for each phone thus consist of a vector of n
Analyzing the method
We refer to the method described in Sections 3.2 (Scoring adjacent measurement pairs), 3.3 (Scoring track pairs) and 3.4 (Computing the likelihood ratio) as the baseline method. We performed a range of sensitivity analyses to assess how the performance of this baseline method changes when:
1. a simpler approach than the baseline two-step method is used (Section 4.1)
2. other classifiers than logistic regression are used (Section 4.2)
3. other calibration methods are used (Section 4.3)
4. the percentage of train data used
Results
The 18 independent field experiments resulted in 273 same-user track pairs and 428 different-user track pairs, with a total of 3847 and 6044 switches, respectively. The real user data resulted in a validation dataset of 28 same-user and 300 different-user track pairs, with a total of 844 and 4750 switches, respectively.
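For orientation, the counts above can be converted into an average number of switches per track pair. The short calculation below only divides the reported totals; the dictionary labels are illustrative and not taken from the paper.

```python
# Counts copied from the paragraph above: (track pairs, switches).
counts = {
    "field experiments, same user": (273, 3847),
    "field experiments, different user": (428, 6044),
    "real user data, same user": (28, 844),
    "real user data, different user": (300, 4750),
}

# Mean number of switches per track pair for each subset.
switches_per_pair = {
    label: switches / pairs for label, (pairs, switches) in counts.items()
}

for label, value in switches_per_pair.items():
    print(f"{label}: {value:.1f} switches per track pair")
```

Both field-experiment subsets come out at about 14 switches per track pair, while the real user data gives about 30 for same-user pairs and about 16 for different-user pairs.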
Fig. 6 gives an overview of the distribution of features. As expected, for the reference phones the time difference distribution is nearly identical for the two hypotheses, as
Discussion
We presented a novel method for evaluating the strength of evidence from call detail records that any pair of phones was carried by the same person. The method produces a score for each pair of registrations, from which a score for any pair of phones over a given period is derived. A calibration step follows, in which the score is converted into a likelihood ratio (LR) between the two hypotheses. Using data from experiments and real phone usage, we assessed the performance of the method. We further assessed the
Acknowledgements
We are grateful to everyone who supported this project and helped to obtain high-quality data, including the volunteers who shared their cell phone data or participated in the field experiments. We further thank Cor Veenman, Jeanette Leegwater and the reviewers for helpful suggestions and critical reading.