Elsevier

Science & Justice

Volume 60, Issue 2, March 2020, Pages 180-190
Establishing phone-pair co-usage by comparing mobility patterns

https://doi.org/10.1016/j.scijus.2019.10.005

Highlights

  • Cell phone users can be identified by comparing mobility patterns from usage data.

  • Call Detail Records, widely used for intelligence, are also forensic evidence.

  • Our results can be used in court when precautions are taken into account.

  • Data quality has more impact on performance than sophisticated models and features.

Abstract

In forensic investigations it is often of value to establish whether two phones were used by the same person during a given time period. We present a method that uses time and location of cell tower registrations of mobile phones to assess the strength of evidence that any pair of phones were used by the same person. The method is transparent as it uses logistic regression to discriminate between the hypotheses of same and different user, and a standard kernel density estimation to quantify the weight of evidence in terms of a likelihood ratio. We further add to previous theoretical work by training and validating our method on real world data, paving the way for application in practice. The method shows good performance under different modeling choices and robustness under lower quantity or quality of data. We discuss practical usage in court.

Introduction

As mobile phones have become established in everyday communication, so has their use in criminal activity. Criminals often use cheap, prepaid “burner phones” to communicate about criminal activities. These burner phones are used for some time before being discarded. Users of burner phones often carry multiple phones at the same time, including a legitimate, “personal” mobile phone to contact family and friends.

Identifying the user of a burner phone can be of great value in tying criminal activities to suspects. One type of trace that may reveal the user’s identity is the location trace derived from call detail records (CDRs): the location data from the burner phone can be compared with location data from the personal phone of a suspect. CDRs are stored by telecom providers and can typically be requested by police for a period of up to several months in the past, even if the phone was never physically found.

Traditionally, forensic analysis of location data is centered on key events and focuses on a few individual records from CDR files. However, CDRs suffer from low accuracy when it comes to pinpointing the position of a cell phone user. While individual records may show whether or not a cell phone is within a certain area, they are less suitable for determining whether the phone is with the same user as another phone. On the other hand, when CDRs are available for an extended period of time, there is no need to restrict this comparison to one or several records. A series of CDRs of two phones can provide statistical evidence that the two phones were carried by the same user.
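To make the idea of comparing two series of CDRs concrete, the following is a minimal sketch, not the procedure used in this paper. It assumes that each registration is reduced to a timestamp and the coordinates of the cell tower, and that consecutive registrations from different phones are paired and described by their time difference and distance. The names `Registration`, `haversine_km` and `adjacent_pairs` are hypothetical.

```python
import math
from typing import List, NamedTuple, Tuple


class Registration(NamedTuple):
    """One CDR-derived observation (hypothetical layout, assumed for illustration)."""
    phone: str   # label of the phone that registered
    t: float     # timestamp in seconds
    lat: float   # latitude of the cell tower in degrees
    lon: float   # longitude of the cell tower in degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points on a spherical Earth."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def adjacent_pairs(track_a: List[Registration],
                   track_b: List[Registration]) -> List[Tuple[float, float]]:
    """Merge both tracks by time and return (time difference in s, distance in km)
    for every pair of consecutive registrations that come from different phones."""
    merged = sorted(track_a + track_b, key=lambda r: r.t)
    pairs = []
    for prev, cur in zip(merged, merged[1:]):
        if prev.phone != cur.phone:
            pairs.append((cur.t - prev.t,
                          haversine_km(prev.lat, prev.lon, cur.lat, cur.lon)))
    return pairs


# Toy tracks: phone B registers at the same tower between two registrations of phone A.
track_a = [Registration("A", 0.0, 52.0, 4.0), Registration("A", 600.0, 52.0, 4.0)]
track_b = [Registration("B", 300.0, 52.0, 4.0)]
pairs = adjacent_pairs(track_a, track_b)
```

Short time differences combined with large distances would argue against a common user, whereas many pairs with small distances are what one would expect if both phones travel together.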

In this paper, we propose a method of comparing mobility patterns from CDRs of two phones, using logistic regression. The outcome of this process is a similarity score which indicates whether the phones traveled with the same or with a different user. To attain usability in court, we use this similarity score to calculate evidential strength in terms of a likelihood ratio. The likelihood ratio expresses how probable the similarity score is in the scenario that both phones belong to the same user, relative to the scenario that the phones belong to different users. We performed field experiments to obtain the reference set needed for this approach. Furthermore, we tested the method’s performance on data from real phone usage, which was not used to train the model. Finally, we assessed the robustness of this method by evaluating performance under a range of different scenarios and method perturbations, including a comparison to a simpler approach based on the notion of dislocations.
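The step from similarity score to likelihood ratio can be illustrated with a small sketch. It is not the implementation used in this paper: it assumes a hand-written one-dimensional Gaussian kernel density estimate with Silverman's rule-of-thumb bandwidth, and it uses synthetic reference scores in place of the scores from the field experiments. The function names `gaussian_kde` and `likelihood_ratio` are hypothetical.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a density function estimated from 1-D samples with a Gaussian kernel."""
    n = len(samples)
    if bandwidth is None:
        # Silverman's rule of thumb; an assumption, the bandwidth choice is not given here.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density


def likelihood_ratio(score, same_user_scores, different_user_scores):
    """Density of the score under the same-user hypothesis divided by its
    density under the different-user hypothesis."""
    p_same = gaussian_kde(same_user_scores)(score)
    p_diff = gaussian_kde(different_user_scores)(score)
    return p_same / p_diff


# Synthetic reference scores (made up for illustration, not data from the paper).
rng = random.Random(0)
same_user_scores = [rng.gauss(0.7, 0.15) for _ in range(200)]
different_user_scores = [rng.gauss(0.3, 0.15) for _ in range(200)]

lr_high = likelihood_ratio(0.75, same_user_scores, different_user_scores)
lr_low = likelihood_ratio(0.25, same_user_scores, different_user_scores)
```

With these synthetic reference sets, a score near the same-user cluster gives a likelihood ratio above 1 and a score near the different-user cluster gives a ratio below 1. In the far tails the denominator density can become very small, so a practical implementation would need to guard against extreme or unstable ratios.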

Background and related work

When a mobile phone communicates, be it for calling, texting or data traffic, it connects to a cell tower. The timestamp of this event, as well as the identification number (cell id) of the tower are recorded by the telecom provider in call detail records. The location of this cell tower gives an approximation of the location of the mobile phone user, because a mobile phone will preferentially connect to the cell tower that it has the best connection with in terms of signal strength. This means

Materials and methods

We will assume that two mobile phones m1,m2 that have the same user will be in the same place at all times. The call detail records we use to determine the location of a mobile phone are only generated at certain events, such as the initiation of a phone call. This means we will have a limited number of observations for each phone during a period of comparison, where an observation consists of a time t and the location of the cell tower x. The data for each phone mi thus consist of a vector of n

Analyzing the method

We refer to the method described in Sections 3.2 (Scoring adjacent measurement pairs), 3.3 (Scoring track pairs) and 3.4 (Computing the likelihood ratio) as the baseline method. We performed a range of sensitivity analyses to assess how the performance of this baseline method changes when:

  1. a simpler approach than the baseline two-step method is used (Section 4.1)

  2. other classifiers than logistic regression are used (Section 4.2)

  3. other calibration methods are used (Section 4.3)

  4. the percentage of train data used

Results

The 18 independent field experiments resulted in 273 same-user track pairs and 428 different-user track pairs, with a total of 3847 and 6044 switches, respectively. The real user data resulted in a validation dataset of 28 same-user and 300 different-user track pairs, with a total of 844 and 4750 switches, respectively.

Fig. 6 gives an overview of the distribution of features. As expected, for the reference phones the time difference distribution is nearly identical for the two hypotheses, as

Discussion

We presented a novel method for evaluating the strength of evidence from call detail records that any pair of phones was carried by the same person. The method produces a score for pairs of registrations, which leads to a score for any pair of phones for a given period. A calibration step follows, in which the score is converted into a likelihood ratio (LR) between the two hypotheses. Using data from experiments and real phone usage, we assessed the performance of the method. We further assessed the

Acknowledgements

We are grateful to everyone who supported this project and helped obtain high-quality data, including the volunteers who shared their cell phone data or participated in the field experiments. We further thank Cor Veenman, Jeanette Leegwater and the reviewers for helpful suggestions and critical reading.
