20 results found
The use of Large Language Models (LLMs) in police operations is growing, yet an evaluation framework tailored to police operations remains absent. While LLM responses may not always be legally incorrect, their unverified use can still lead to severe issues such as unlawful arrests and improper evidence collection. To address this, we propose PAS (Police Action Scenarios), a systematic framework covering the entire evaluation process. Applying this framework, we constructed a novel QA dataset from over 8,000 official documents and established key metrics validated through statistical analysis with police expert judgements. Experimental results show that commercial LLMs struggle with our new police-related tasks, particularly in providing fact-based recommendations. This study highlights the necessity of an expandable evaluation framework to ensure reliable AI-driven police operations. We release our data and prompt template.
We develop and analyze mathematical models for residential burglary that incorporate police deployment through a delayed feedback mechanism. Motivated by empirical observations from publicly available crime and policing data, we extend a well-known agent-based model by introducing a dynamic police response driven by crime information that becomes available only after a finite delay. Taking the mean-field limit, we derive a coupled continuum system consisting of three partial differential equations and one ordinary differential equation describing the interactions among criminal density, environmental attractiveness, delayed crime signal, and police deployment. Linear stability analysis of homogeneous steady states reveals that response delays can destabilize otherwise stable equilibria through Hopf bifurcations. As a result, the model predicts sustained temporal oscillations and dynamically evolving crime hotspots. Numerical simulations of both the agent-based and continuum models confirm the theoretical analysis and uncover rich spatio-temporal behaviors, including moving, splitting, and merging hotspots. Through a parametric study, we investigate the roles of police density, cri
Autocrats use secret police to stay in power, as these organizations deter and suppress opposition to their rule. Existing research shows that secret police succeed at this but, surprisingly, also that they are not as ubiquitous in autocracies as one may assume, existing in fewer than half of autocratic country-years. We thus explore under which conditions secret police emerge in dictatorships. For this purpose, we develop a theoretical framework for potential predictors and apply statistical variable selection techniques to identify which of several candidate variables extracted from the literature on state security forces and authoritarian survival hold explanatory power. Our results highlight that secret police are more likely to emerge when rulers face structural, regime-external threats, such as organised anti-system mobilisation and international rivals, or witness successful regime-internal contestation abroad that hints at similar threats at home. But additionally, we find that rulers must have sufficient material resources and personalised power to establish secret police. This research contributes to our understanding of autocrats' institutional choices and authoritarian
This paper proposes a novel interdisciplinary framework for analyzing police body-worn camera (BWC) footage from the Rochester Police Department (RPD) using advanced artificial intelligence (AI) and statistical machine learning (ML) techniques. Our goal is to detect, classify, and analyze patterns of interaction between police officers and civilians to identify key behavioral dynamics, such as respect, disrespect, escalation, and de-escalation. We apply multimodal data analysis by integrating image, audio, and natural language processing (NLP) techniques to extract meaningful insights from BWC footage. The framework incorporates speaker separation, transcription, and large language models (LLMs) to produce structured, interpretable summaries of police-civilian encounters. We also employ a custom evaluation pipeline to assess transcription quality and behavior detection accuracy in high-stakes, real-world policing scenarios. Our methodology, computational techniques, and findings outline a practical approach for law enforcement review, training, and accountability processes while advancing the frontiers of knowledge discovery from complex police BWC data.
Large-scale policing data is vital for detecting inequity in police behavior and policing algorithms. However, one important type of policing data remains largely unavailable within the United States: aggregated police deployment data capturing which neighborhoods have the heaviest police presences. Here we show that disparities in police deployment levels can be quantified by detecting police vehicles in dashcam images of public street scenes. Using a dataset of 24,803,854 dashcam images from rideshare drivers in New York City, we find that police vehicles can be detected with high accuracy (average precision 0.82, AUC 0.99) and identify 233,596 images which contain police vehicles. There is substantial inequality across neighborhoods in police vehicle deployment levels. The neighborhood with the highest deployment levels has almost 20 times higher levels than the neighborhood with the lowest. Two strikingly different types of areas experience high police vehicle deployments - 1) dense, higher-income, commercial areas and 2) lower-income neighborhoods with higher proportions of Black and Hispanic residents. We discuss the implications of these disparities for policing equity and f
Adaptive traffic signal control (TSC) has demonstrated strong effectiveness in managing dynamic traffic flows. However, conventional methods often struggle when unforeseen traffic incidents occur (e.g., accidents and road maintenance), which typically require labor-intensive and inefficient manual interventions by traffic police officers. Large Language Models (LLMs) appear to be a promising solution thanks to their remarkable reasoning and generalization capabilities. Nevertheless, existing works often propose to replace existing TSC systems with LLM-based systems, which can be (i) unreliable due to the inherent hallucinations of LLMs and (ii) costly due to the need for system replacement. To address the issues of existing works, we propose a hierarchical framework that augments existing TSC systems with LLMs, whereby a virtual traffic police agent at the upper level dynamically fine-tunes selected parameters of signal controllers at the lower level in response to real-time traffic incidents. To enhance domain-specific reliability in response to unforeseen traffic incidents, we devise a self-refined traffic language retrieval system (TLRS), whereby retrieval-augmented generation i
We evaluated one of the most common policing strategies in Brazil: the allocation of police blitzes. This place-based focused deterrence intervention has well-defined assignments, and 3,423 interventions were precisely recorded in Fortaleza-CE, Brazil, between 2012 and 2013. Our analysis takes advantage of the high daily spatiotemporal resolution of an unprecedented longitudinal micro-Big Data set (GPS and PING records) to compare small intervention areas while controlling for common daily trends, deterrence (spatial and temporal), and diffusion. We show that an average police crackdown causes a 35% decrease in violent crime occurrences. There are diminishing returns of public safety to hours spent by police in a single area, corroborating what police officers know well from their own experience and discretionary behavior. Although crime increases by 6% immediately after the end of a blitz, we observe lasting deterrent effects (diffusion) after 2-3 days. The residual deterrence cancels the relocation of the crime, and the intervention does not generate significant temporal displacement. In addition, we do not find spatial displacement of crime in blocks up
In 2021, the City of Atlanta and the Atlanta Police Foundation launched plans to build a large police training facility in the South River Forest in unincorporated DeKalb County, GA. Residents of Atlanta and DeKalb County, environmental activists, police and prison abolitionists, and other activists and concerned individuals formed a movement in opposition to the facility, known as the Stop Cop City / Defend the Atlanta Forest movement. Social media and digital maps became common tools for communicating information about the facility and the movement. Here, we examine online maps about the facility and the opposition movement, originating from grassroots organizations, the City of Atlanta, news media outlets, the Atlanta Police Foundation, and individuals. We gather and examine 32 publicly available maps collected through the Google Search API, Twitter (now X), Instagram, and Reddit. Using a framework of critical cartography, we conduct a content analysis of these maps to identify the mapping technologies and techniques (data, cartographic elements, styles) used by different stakeholders and the roles that maps and mapping technologies can play in social movements. We examine the extent t
Police departments around the world use two-way radio for coordination. These broadcast police communications (BPC) are a unique source of information about everyday police activity and emergency response. Yet BPC are not transcribed, and their naturalistic audio properties make automatic transcription challenging. We collect a corpus of roughly 62,000 manually transcribed radio transmissions (~46 hours of audio) to evaluate the feasibility of automatic speech recognition (ASR) using modern recognition models. We evaluate the performance of off-the-shelf speech recognizers, models fine-tuned on BPC data, and customized end-to-end models. We find that both human and machine transcription is challenging in this domain. Large off-the-shelf ASR models perform poorly, but fine-tuned models can reach the approximate range of human performance. Our work suggests directions for future work, including analysis of short utterances and potential miscommunication in police radio interactions. We make our corpus and data annotation pipeline available to other researchers, to enable further research on recognition and analysis of police communication.
This study provides the first empirical evidence that private donations to police departments can influence officer behavior. Drawing on the psychology of reciprocity bias, we theorize that public donations create social debts that shape discretionary enforcement. Using quasi-experimental data from Chicago, we find that after 7-Eleven sponsored a police foundation gala, investigatory stops, particularly of Black pedestrians, increased around its stores. These findings reveal a racialized pattern of donor bias in policing and call into question the consequences of private donations to public law enforcement.
Face-to-face interactions between police officers and the public affect both individual well-being and democratic legitimacy. Many government-public interactions are captured on video, including interactions between police officers and drivers captured on body-worn cameras (BWCs). New advances in AI technology enable these interactions to be analyzed at scale, opening promising avenues for improving government transparency and accountability. However, for AI to serve democratic governance effectively, models must be designed to include the preferences and perspectives of the governed. This article proposes a community-informed approach to developing multi-perspective AI tools for government accountability. We illustrate our approach by describing the research project through which the approach was inductively developed: an effort to build AI tools to analyze BWC footage of traffic stops conducted by the Los Angeles Police Department. We focus on the role of social scientists as members of multidisciplinary teams responsible for integrating the perspectives of diverse stakeholders into the development of AI tools in the domain of police -- and government -- accountability.
Objectives: Compare qualitative coding of instruction tuned large language models (IT-LLMs) against human coders in classifying the presence or absence of vulnerability in routinely collected unstructured text that describes police-public interactions. Evaluate potential bias in IT-LLM codings. Methods: Analyzing publicly available text narratives of police-public interactions recorded by Boston Police Department, we provide humans and IT-LLMs with qualitative labelling codebooks and compare labels generated by both, seeking to identify situations associated with (i) mental ill health; (ii) substance misuse; (iii) alcohol dependence; and (iv) homelessness. We explore multiple prompting strategies and model sizes, and the variability of labels generated by repeated prompts. Additionally, to explore model bias, we utilize counterfactual methods to assess the impact of two protected characteristics - race and gender - on IT-LLM classification. Results: Results demonstrate that IT-LLMs can effectively support human qualitative coding of police incident narratives. While there is some disagreement between LLM and human generated labels, IT-LLMs are highly effective at screening narrativ
Crime situations are a race against time. Police officers need an AI-assisted criminal investigation system that provides prompt but precise legal counsel. We introduce LAPIS (Language Model Augmented Police Investigation System), an automated system that assists police officers in performing rational and legal investigative actions. We constructed a finetuning dataset and a retrieval knowledgebase specialized in the legal reasoning task of crime investigation. We improved the dataset's quality by incorporating manual curation by a group of domain experts. We then finetuned the pretrained weights of a smaller Korean language model on the newly constructed dataset and integrated it with the crime investigation knowledgebase retrieval approach. Experimental results show LAPIS' potential in providing reliable legal guidance for police officers, even better than the proprietary GPT-4 model. Qualitative analysis of the rationales generated by LAPIS demonstrates the model's ability to leverage the premises and derive legally correct conclusions.
In 2020 the tragic murder of George Floyd at the hands of law enforcement ignited and intensified nationwide protests, demanding changes in police funding and allocation. This happened during a budgeting feedback exercise where residents of Austin, Texas were invited to share opinions on the budgets of various city service areas, including the Police Department, on an online platform designed by our team. Daily responses increased by a hundredfold and responses registered after the "exogenous shock" overwhelmingly advocated for reducing police funding. This opinion shift far exceeded what we observed in 14 other Participatory Budgeting elections on our Participatory Budgeting Platform, and can't be explained by shifts in the respondent demographics. Analysis of the results from an Austin budgetary feedback exercise in 2021 and a follow-up survey indicates that the opinion shift from 2020 persisted, with the opinion gap on police funding widening. We conclude that there was an actual change of opinion regarding police funding. This study not only sheds light on the enduring impact of the 2020 events and protests on public opinion, but also showcases the value of analysis of clustere
Achieving a delicate balance between fostering trust in law enforcement and protecting the rights of both officers and civilians continues to emerge as a pressing research and product challenge in the world today. In the pursuit of fairness and transparency, this study presents an innovative AI-driven system designed to generate police report drafts from complex, noisy, and multi-role dialogue data. Our approach intelligently extracts key elements of law enforcement interactions and includes them in the draft, producing structured narratives that are not only high in quality but also reinforce accountability and procedural clarity. This framework holds the potential to transform the reporting process, ensuring greater oversight, consistency, and fairness in future policing practices. A demonstration video of our system can be accessed at https://drive.google.com/file/d/1kBrsGGR8e3B5xPSblrchRGj-Y-kpCHNO/view?usp=sharing
Police incident data is crucial for public security intelligence, yet grassroots agencies struggle with efficient classification due to manual inefficiency and automated system limitations, especially in telecom and online fraud cases. This research proposes a multichannel neural network model, KLCBL, integrating Kolmogorov-Arnold Networks (KAN), a linguistically enhanced text preprocessing approach (LERT), Convolutional Neural Network (CNN), and Bidirectional Long Short-Term Memory (BiLSTM) for police incident classification. Evaluated with real data, KLCBL achieved 91.9% accuracy, outperforming baseline models. The model addresses classification challenges, enhances police informatization, improves resource allocation, and offers broad applicability to other classification tasks.
The goal of this paper is to assess whether there is any correlation between police salaries and crime rates. Using public data sources that contain Baltimore Crime Rates and Baltimore Police Department (BPD) salary information from 2011 to 2021, our research uses a variety of techniques to capture and measure any correlation between the two. Based on that correlation, the paper then uses established social theories to make recommendations on how this data can potentially be used by State Leadership. Our initial results show a negative correlation between salary/compensation levels and crime rates.
We investigate the phenomenon of norm inconsistency: where LLMs apply different norms in similar situations. Specifically, we focus on the high-risk application of deciding whether to call the police in Amazon Ring home surveillance videos. We evaluate the decisions of three state-of-the-art LLMs -- GPT-4, Gemini 1.0, and Claude 3 Sonnet -- in relation to the activities portrayed in the videos, the subjects' skin-tone and gender, and the characteristics of the neighborhoods where the videos were recorded. Our analysis reveals significant norm inconsistencies: (1) a discordance between the recommendation to call the police and the actual presence of criminal activity, and (2) biases influenced by the racial demographics of the neighborhoods. These results highlight the arbitrariness of model decisions in the surveillance context and the limitations of current bias detection and mitigation strategies in normative decision-making.
Police patrol units need to split their time between performing preventive patrol and being dispatched to serve emergency incidents. In the existing literature, patrol and dispatch decisions are often studied separately. We consider joint optimization of these two decisions to improve police operations efficiency and reduce response time to emergency calls. Methodology/results: We propose a novel method for jointly optimizing multi-agent patrol and dispatch to learn policies yielding rapid response times. Our method treats each patroller as an independent Q-learner (agent) with a shared deep Q-network that represents the state-action values. The dispatching decisions are chosen using mixed-integer programming and value function approximation from combinatorial action spaces. We demonstrate that this heterogeneous multi-agent reinforcement learning approach is capable of learning joint policies that outperform those optimized for patrol or dispatch alone. Managerial Implications: Policies jointly optimized for patrol and dispatch can lead to more effective service while targeting demonstrably flexible objectives, such as those encouraging efficiency and equity in response.
Radios are essential for the operations of modern police departments, and they function as both a collaborative communication technology and a sociotechnical system. However, little prior research has examined their usage or their connections to individual privacy and the role of race in policing, two growing topics of concern in the US. As a case study, we examine the Chicago Police Department's (CPD's) use of broadcast police communications (BPC) to coordinate the activity of law enforcement officers (LEOs) in the city. From a recently assembled archive of 80,775 hours of BPC associated with CPD operations, we analyze text transcripts of radio transmissions broadcast from 9:00 AM to 5:00 PM on August 10th, 2018 in one majority Black, one majority white, and one majority Hispanic area of the city (24 hours of audio) to explore four research questions: (1) Do BPC reflect reported racial disparities in policing? (2) How and when are gender, race/ethnicity, and age mentioned in BPC? (3) To what extent do BPC include sensitive information, and who is put at most risk by this practice? (4) To what extent can large language models (LLMs) heighten this risk? We explore the vocabulary and spee