Towards a trustworthy ROXANNE platform

Funded under the Horizon 2020 programme, which supports ground-breaking research and advances European excellence, ROXANNE aims to help law enforcement authorities (LEAs) fight crime and terrorism by facilitating the analysis of criminal data. To this end, the ROXANNE platform combines innovative data analysis capabilities, including speech and language technologies, visual analysis and network analysis, to help identify perpetrators. ROXANNE’s innovation lies in the bi-directional interaction between the multimodal technological processes integrated in the platform. The analysis process further benefits from prior knowledge available to investigators, yielding more accurate results that contribute to advancing the case. With Artificial Intelligence (AI) at the core of the ROXANNE platform through the underlying algorithmic models, the project team adopted an ethics- and privacy-by-design approach in its research work in order to develop an ethically, legally and socially sound final result.

With the project in its third and final year, this blog post describes some of the actions taken in the project towards the development of a trustworthy ROXANNE platform. In addition to abiding by the high research ethics standards applicable in EU projects, as presented in this earlier blog post, the ROXANNE team ensured compliance with the different privacy and data protection provisions summarized in this checklist. Nevertheless, to be publicly acceptable, the ROXANNE platform must not only respect fundamental rights and applicable law but also be developed in a manner that is aligned with the values and principles shared in society. Therefore, in the pursuit of safer societies through ROXANNE technologies, it is important to thoroughly address public concerns about such technological issues as complexity, opacity, unpredictability, autonomy and bias. In response to these concerns, the researched ROXANNE technologies endeavor, inter alia, to rely on quality testing and validation data, fulfill transparency requirements, incorporate system authentication features, maintain human oversight and accountability, and ensure technological robustness.

The availability of quality training and testing data is essential in research that leverages computer models to produce adequately performing technologies aimed at enhancing public safety. Without data, the development of AI is not possible, and diverse, representative and fair data is a prerequisite for unbiased results. To overcome the multiple legal, ethical and security hurdles related to the use of real data in ROXANNE, the consortium decided early in the project lifespan to build a synthetic dataset, called RXSD, inspired by a real drug case and following a scenario prepared by a project LEA. This is in line with the European Data Protection Supervisor’s position on the use of synthetic data in research as a privacy-enhancing technology. In addition, as the ROXANNE team follows responsible data management practices and the Findable, Accessible, Interoperable and Reusable (FAIR[1]) principles, the RXSD dataset will be shared through the Zenodo[2] repository with other reputable researchers, contributing to trust in and reusability of the dataset. The project also benefited from real, closed case data, shared by one of the project LEAs in an anonymized format. This data was used to demonstrate the latest platform capabilities during the second field test, held in October 2021 at the Netherlands Forensic Institute. As part of the continuous testing of the ROXANNE platform, project LEAs appraise the platform’s performance in-house, using their own data on their own premises, thus avoiding data sharing hurdles. For example, the BALSAS databases are used by the Forensic Centre of Lithuania to test the automatic speech recognition integrated in the platform. Nevertheless, data scarcity remains an issue for research, and ROXANNE technical partners suggest some ways of tackling it in relation to natural language processing.

Within the limits set by security, confidentiality and intellectual property rights considerations, the ROXANNE team attempts to be as transparent as possible about the work that it carries out. This is attested by the open nature of project events and the team's constant communication to the public on performed work and upcoming activities. Both ROXANNE field tests were accessible upon registration to all interested participants. Moreover, the public is regularly informed about ROXANNE events and milestones through a variety of means, including the project website, bi-annual newsletters and social media communication. In addition, dedicated channels are used for focused engagement with different expert groups, such as representatives of the international law enforcement community, Stakeholder Board and Ethics Oversight Board members, and cluster and sister projects, to share knowledge and maximize impact. Finally, the ROXANNE team conveys, in accessible language, the different technological processes and methods behind the ROXANNE platform. Since the platform integrates link prediction as part of its network analysis function, this blog post presents the robustness and performance of different link prediction algorithms. The automatic speech recognition system is the subject of a different post, which explains its functioning, limitations and prospects, and highlights that the accuracy and reliability of results are contingent on a number of factors (e.g. controlled conditions and sufficiently good signal quality).

Another characteristic of trustworthy AI is the competent handling of technologies by operators. In ROXANNE, an entire work package is dedicated to designing and providing comprehensive training to end-users, to ensure that they have the capacity and knowledge required to fully appreciate the functioning and shortcomings of the platform and its constituent tools. This blog post explains the training methodology adopted for effective use of the ROXANNE platform through individual training modules, presenting the learning goals and techniques, which take the audience’s background into account. At the last field test, participants experienced an interactive guidance session for the ROXANNE training environment. In addition to enabling proper control of the platform, end-user training is a prerequisite for accurate interpretation of results, including of threshold requirements, such as the minimum characteristics of input data needed to mitigate the risk of false positive or false negative results. Beyond the technological operation and performance of an AI system, training is an opportunity to remind end-users of applicable national and regional legal provisions and organizational codes or policies. The ROXANNE platform’s decision-making mechanism reinforces its human-centric operation by requiring critical engagement from operators at key stages of platform use to confirm that the use of technologies is necessary and proportionate. This procedure emphasizes ROXANNE’s assistive character, as the platform is neither capable of replacing investigators in their work nor intended to do so. Through its automated analysis capabilities, ROXANNE aims to alleviate staff and time pressure by prioritizing data for human consideration following its multi-modal pre-processing.

As the ROXANNE platform is being fine-tuned, oversight and security are part of the privacy-by-design approach, entailing user identification and accountability mechanisms. These features will enable logging of data processing for auditability, integrity and, potentially, evidence admissibility purposes. These aspects are the focus of the final year of the platform development and may be presented during the final field test. To find out more about these and other aspects related to the ROXANNE platform, do not forget to sign up to the ROXANNE newsletter.

 

 

[1] Findable, Accessible, Interoperable and Reusable as stated in the Final Report and Action Plan from the Commission Expert Group on FAIR data, 2018, https://ec.europa.eu/info/sites/info/files/turning_fair_into_reality_1.pdf

[2] https://zenodo.org/ The OpenAIRE project, in the vanguard of the open access and open data movements in Europe, was commissioned by the EC to support its nascent Open Data policy by providing a catch-all repository for EC-funded research. CERN, an OpenAIRE partner and pioneer in open source, open access and open data, provided this capability, and Zenodo was launched in May 2013.