Text mining police data, 4 case studies on identifying domestic violence, human trafficking, terrorism and pedophile suspects in an overload of textual information

Speaker(s)
Jonas Poelmans
Affiliation
Katholieke Univ. Leuven, Belgium
Date
March 2, 2012, 14:15
Room
Room 5820
Seminar
Research Seminar of the Division of Logic: Approximate Reasoning in Data Mining

In the first part of this talk we introduce a human-centered process for knowledge discovery from unstructured text that makes use of Formal Concept Analysis and Emergent Self-Organizing Maps. The knowledge discovery process is conceptualized and interpreted as successive iterations through the Concept-Knowledge (C-K) theory design square. To illustrate its effectiveness, we report on a real-life case study in which the Amsterdam-Amstelland police in the Netherlands used the process to distill concepts for identifying domestic violence from the unstructured text of actual police reports. The case study shows how the process not only uncovered the nature of a phenomenon such as domestic violence, but also enabled analysts to identify many types of anomalies in the practice of policing. We will illustrate how the insights obtained from this exercise resulted in major improvements in the management of domestic violence cases.

In the second part of this talk we describe the successful application of our FCA-based, semi-automated approach to knowledge discovery in databases for extracting and profiling unknown suspects involved in forced prostitution from observational police reports. Between 700 000 and 2 000 000 women and children are trafficked across international borders each year, and the majority of them are forced to work in the sex industry. Police organizations in the Netherlands hold a continuously growing volume of unstructured text reports describing observations made by police officers during their work in the field. Based on guidelines of the Attorneys General of the Netherlands, we defined multiple early warning indicators that were used to index the 266 157 police reports. Using FCA lattices we revealed numerous unknown human trafficking and loverboy suspects. In-depth investigation by the police confirmed their involvement in illegal activities and led to actual arrests. Our human-centered approach was embedded into operational policing practice and is now used successfully on a daily basis to cope with the rapidly growing amount of unstructured information.
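
As a rough illustration of the idea of indexing reports with indicators and then building an FCA lattice, the following minimal Python sketch enumerates the formal concepts of a toy context. It is not the speaker's implementation: the report IDs, the indicator names, and the brute-force enumeration are all assumptions made for illustration, and the enumeration is exponential in the number of reports, so it only suits tiny examples.

```python
from itertools import combinations

# Hypothetical toy context: report IDs mapped to the early warning
# indicators found in them. All names are invented for illustration.
context = {
    "report_1": {"expensive_car", "id_papers_held_by_other"},
    "report_2": {"expensive_car", "signs_of_violence"},
    "report_3": {"expensive_car", "id_papers_held_by_other", "signs_of_violence"},
    "report_4": {"signs_of_violence"},
}


def extent(context, attrs):
    """Return the objects (reports) that have every attribute in attrs."""
    return frozenset(obj for obj, has in context.items() if attrs <= has)


def intent(context, objs):
    """Return the attributes (indicators) shared by every object in objs."""
    shared = set().union(*context.values())  # start from all attributes
    for obj in objs:
        shared &= context[obj]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Brute force: exponential in the number of objects, toy contexts only.
    """
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for objs in combinations(objects, size):
            shared_attrs = intent(context, objs)
            concepts.add((extent(context, shared_attrs), shared_attrs))
    return concepts


concepts = formal_concepts(context)
# Print concepts from the most general (largest extent) to the most specific.
for ext, itt in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(ext), sorted(itt))
```

Each printed pair groups the reports that share exactly a given combination of indicators; ordering the pairs by extent inclusion gives the concept lattice that an analyst would browse.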

In the third part of this talk we use Formal Concept Analysis to extract and visualize potential jihadists in the different phases of radicalisation from a large set of reports describing police observations. The National Police Service Agency of the Netherlands developed a model to classify (potential) jihadists in four sequential phases of radicalism. The goal of the model is to signal the potential jihadist as early as possible, to prevent him or her from entering the next phase. Until now, this model has never been used to actively find new subjects. We employ Temporal Concept Analysis to visualize how a possible jihadist radicalizes over time. The combination of these instruments allows for easy decision-making on where and when to act.
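
The notion of following a subject through sequential phases over time can be sketched very roughly as below. This is a simplified stand-in, not Temporal Concept Analysis itself and not the police model: the subject IDs, dates, and the encoding of phases as integers 1 to 4 are assumptions made purely for illustration.

```python
from datetime import date

# Hypothetical observations: (report date, subject id, phases whose
# indicators appear in the report). All values are invented.
observations = [
    (date(2008, 5, 1), "subject_A", {1}),
    (date(2008, 1, 10), "subject_A", set()),
    (date(2008, 9, 20), "subject_A", {1, 2}),
    (date(2008, 3, 3), "subject_B", {1}),
]


def life_track(observations, subject):
    """Return the chronological list of (date, highest phase seen so far).

    Assumes phases are sequential, so a subject never moves back to a
    lower phase; phase 0 means no indicator has been observed yet.
    """
    track = []
    highest = 0
    for when, who, phases in sorted(observations, key=lambda obs: obs[0]):
        if who != subject:
            continue
        highest = max(highest, max(phases, default=0))
        track.append((when, highest))
    return track


for when, phase in life_track(observations, "subject_A"):
    print(when.isoformat(), "phase", phase)
```

Plotting such a track over the nodes of a concept lattice, rather than over bare phase numbers, is what gives the time-evolution view described in the abstract.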

In the fourth part of this talk we propose a novel KDD methodology based on Temporal Relational Semantic Systems, the main structure in the temporal and relational version of Formal Concept Analysis. Grooming is the process by which pedophiles try to find children on the internet for sex-related purposes. In chat conversations they may try to establish a connection and escalate the conversation towards a physical meeting. To date, no good methods exist for quickly analyzing the contents, the evolution over time, the present state, and the threat level of these chat conversations. To gain rapid insight into the topics of chat conversations, we combine a linguistic ontology for chat terms with conceptual scaling and represent the dynamics of chats by life tracks in nested line diagrams. To showcase the possibilities of our approach, we used chat conversations from a public American organization that actively searches for pedophiles on the internet.