We report results of the CASE 2022 Shared Task 1 on Multilingual Protest Event Detection. This task continues CASE 2021 and consists of four subtasks: i) document classification, ii) sentence classification, iii) event sentence coreference identification, and iv) event extraction. The CASE 2022 extension expands the test data with more data in the previously available languages, namely English, Hindi, Portuguese, and Spanish, and adds new test data in Mandarin, Turkish, and Urdu for Subtask 1, document classification. The training data from CASE 2021 in English, Portuguese, and Spanish were utilized. Therefore, predicting document labels in Hindi, Mandarin, Turkish, and Urdu occurs in a zero-shot setting. The CASE 2022 workshop accepts reports on systems developed for predicting test data of CASE 2021 as well. We observe that the best systems submitted by CASE 2022 participants achieve between 79.71 and 84.06 F1-macro for new languages in a zero-shot setting. The winning approaches are mainly ensembling models and merging data in multiple languages. The best two submissions on CASE 2021 data outperform submissions from last year for Subtask 1 and Su
We examine how legal infrastructure organizes eviction in Philadelphia. Using 755,004 Philadelphia landlord--tenant court records filed from 1969 to 2022, we show that eviction is concentrated most strongly among plaintiff-side attorneys. In a typical year, the 10 most active plaintiff attorneys, about 3-4% of active plaintiff attorneys, handle 82.0% of represented cases. Filing is also highly routinized. It is largely same-plaintiff filing, concentrated at the same addresses, and reproduced through recurring plaintiff-attorney-property combinations. Eviction, in short, is organized through repeat actors and repeat places. Specialist attorney plaintiff-side counsel changes how cases are handled inside that system. When plaintiffs adopt specialist attorney counsel, filings rise and repeated use of the same addresses increases, although those filing-margin shifts appear to reflect broader reorganization around counsel entry. In stronger within-plaintiff and within-plaintiff-property comparisons, specialist attorney counsel is associated with fewer judgments by agreement, a lower fee share, and much less lockout-trigger language, with weaker evidence for default and downstream enforce
Reverse engineering (RE) of x86 binaries is indispensable for malware and firmware analysis, but remains slow due to stripped metadata and adversarial obfuscation. Large Language Models (LLMs) offer potential for improving RE efficiency through automated comprehension and commenting, but cloud-hosted, closed-weight models pose privacy and security risks and cannot be used in closed-network facilities. We evaluate parameter-efficient fine-tuned local LLMs for assisting with x86 RE tasks in these settings. Eight open-weight models across the CodeLlama, Qwen2.5-Coder, and CodeGemma series are fine-tuned on a custom curated dataset of 5,981 x86 assembly examples. We evaluate them quantitatively and identify the fine-tuned Qwen2.5-Coder-7B as the top performer, which we name REx86. REx86 reduces test-set cross-entropy loss by 64.2% and improves semantic cosine similarity against ground truth by 20.3% over its base model. In a limited user case study (n=43), REx86 significantly enhanced line-level code understanding (p = 0.031) and increased the correct-solve rate from 31% to 53% (p = 0.189), though the latter did not reach statistical significance. Qualitative analysis shows more accur
We construct a large family of conformally covariant tridifferential operators as tangential operators in the Fefferman--Graham ambient space. Our construction is analogous to the linear and bilinear constructions of Graham--Jenne--Mason--Sparling and Case--Lin--Yuan, respectively. We also show that the symmetrizations of our ambient operators are formally self-adjoint when acting on densities of the correct weight.
We show that there are infinitely many pairwise nonhomothetic, complete, periodic metrics with constant scalar curvature that are conformal to the round metric on $S^n\setminus S^k$, where $k < \frac{n-2}{2}$. These metrics are obtained by pulling back Yamabe metrics defined on products of $S^{n-k-1}$ and compact hyperbolic $(k+1)$-manifolds. Our main result proves that these solutions are generically distinct up to homothety. The core of our argument relies on classical rigidity theorems due to Obata and Ferrand, which characterize the round sphere by its conformal group.
Although multiple works have proposed energy-efficient resource allocation schemes for Massive Multiple-Input Multiple-Output (M-MIMO) systems, most approaches overlook the potential of optimizing Power Amplifier (PA) transmission power while accounting for non-linear distortion effects. Furthermore, most M-MIMO studies assume narrow-band transmission, neglecting subcarrier intermodulations at the non-linear PA for an Orthogonal Frequency Division Multiplexing (OFDM) system. Therefore, this work investigates energy-efficient power allocation for a single-user equipment (UE) M-MIMO downlink (DL) system employing OFDM with non-linear PAs. Unlike prior works, we model wide-band transmission using a soft-limiter PA model and derive a closed-form expression for the signal-to-distortion-and-noise ratio (SNDR) under Rayleigh fading and Maximal Ratio Transmission (MRT) precoding. Next, the Energy Efficiency (EE) function is defined considering two PA architectures and a distorted OFDM signal. We then propose a low-complexity root-finding algorithm to maximize EE by transmit power adjustment. Simulation results demonstrate significant EE gains over a fixed PA back-off baseline, with over
The aim of the CASE 2021 Shared Task 1 (Hürriyetoğlu et al., 2021) was to detect and classify socio-political and crisis event information at document, sentence, cross-sentence, and token levels in a multilingual setting, with each of these subtasks being evaluated separately in each test language. Our submission contained entries in all of the subtasks, and the scores obtained validated our research finding: that the multilingual aspect of the tasks should be embraced, so that modeling and training regimes use the multilingual nature of the tasks to their mutual benefit, rather than trying to tackle the different languages separately. Our code is available at https://github.com/HandshakesByDC/case2021/
In this note, we show that, despite the widespread assumption, the consistency formula for Peano Arithmetic PA, Con(PA), "for all x, x is not a code of a derivation of (0=1)," is not equivalent in PA to the consistency of PA. Specifically, we demonstrate that "PA is consistent" is provably in PA equivalent to the series ConS(PA) of arithmetical sentences "n is not a code of a derivation of (0=1)" for n=0,1,2,.... Since Con(PA) is strictly stronger in PA than ConS(PA), the unprovability of Con(PA) in PA does not yield the unprovability of PA-consistency.
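The distinction drawn in this note can be written out in symbols. The notation below is an assumption made for illustration and is not taken from the note itself: $\mathrm{Prf}_{\mathsf{PA}}(x, y)$ stands for a standard arithmetical proof predicate, $\ulcorner 0=1 \urcorner$ for the code of the formula $0=1$, and $\overline{n}$ for the numeral of $n$.

```latex
\[
\mathrm{Con}(\mathsf{PA}) \;:\quad
\forall x\, \neg\,\mathrm{Prf}_{\mathsf{PA}}\bigl(x, \ulcorner 0=1 \urcorner\bigr),
\]
\[
\mathrm{ConS}(\mathsf{PA}) \;=\;
\bigl\{\, \neg\,\mathrm{Prf}_{\mathsf{PA}}\bigl(\overline{n}, \ulcorner 0=1 \urcorner\bigr)
\;:\; n = 0, 1, 2, \ldots \,\bigr\}.
\]
```

Under this reading, Con(PA) is a single universally quantified sentence, whereas ConS(PA) is an infinite series of numerical instances. Each instance follows from Con(PA) by instantiating $x$ with $\overline{n}$, which is the easy direction of the comparison stated in the note.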
In this paper, a new Monte Carlo Glauber model is developed for pA collisions. It uses the hadronic cross sections calculated within the KMR model as implemented in the SHRiMPS minimum bias module of the SHERPA event generator. These cross sections are obtained as functions of impact parameter and, therefore, are ready for use in a Glauber model without additional assumptions regarding their impact parameter dependence. We compare the results obtained with those from a Black Disk model and from another model with colour fluctuations. It is shown that the KMR/SHRiMPS cross sections may present very good descriptions of the multiplicity distributions in pA collisions given that they show a long tail in the distribution of wounded nucleons. Moreover, it is shown that they also increase the anisotropy in the spatial distribution of wounded nucleons, which can be important in the description of the initial states of pA collisions. The generalization to A+A collisions is straightforward.
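For orientation, the Black Disk comparison model mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: nucleons are sampled uniformly inside a hard sphere of radius 1.2 A^(1/3) fm (a realistic code would use a Woods-Saxon density), and the function names, the radius formula, and the cross-section value in the usage note are all hypothetical choices. The impact-parameter-dependent KMR/SHRiMPS cross sections are not modelled here.

```python
import math
import random


def sample_nucleons(a, radius, rng):
    """Sample `a` nucleon positions uniformly in a sphere; return (x, y) pairs."""
    points = []
    while len(points) < a:
        x, y, z = (rng.uniform(-radius, radius) for _ in range(3))
        if x * x + y * y + z * z <= radius * radius:
            points.append((x, y))  # only the transverse coordinates are needed
    return points


def wounded_nucleons(a, sigma_fm2, b, rng):
    """Count target nucleons hit by a proton at impact parameter b (in fm)."""
    radius = 1.2 * a ** (1.0 / 3.0)  # assumed hard-sphere nuclear radius in fm
    d_max_sq = sigma_fm2 / math.pi   # black disk: a hit requires d^2 < sigma / pi
    count = 0
    for x, y in sample_nucleons(a, radius, rng):
        if (x - b) ** 2 + y ** 2 < d_max_sq:
            count += 1
    return count
```

A call such as `wounded_nucleons(208, 7.0, 0.0, random.Random(0))` simulates one central collision on a lead-like target with an assumed cross section of 7 fm^2 (70 mb). Repeating the call over sampled impact parameters would give a wounded-nucleon distribution for this simplified model.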
We present the first deep radio continuum observations of Pa 30, a nebula hosting a unique optical source driven by an ultrafast outflow with a velocity of 16,000 km s$^{-1}$. The nebula was proposed to be the remnant of a white dwarf merger that occurred in 1181 CE. We report no detection of diffuse radio emission from Pa 30 or of radio emission from the central source, setting $3\sigma$ upper limit flux densities of $0.84\,\rm mJy$ and $0.29\,\rm mJy$ at 1.5 GHz and 6 GHz, respectively, for Pa 30. The radio surface brightness of Pa 30 is $\sim 3$ orders of magnitude smaller than that of typical supernova remnants (SNRs) with comparable angular size. If Pa 30 is an SNR, our observations show it to be the faintest known in the radio band. Considering that 10\% of the supernova (SN) kinetic energy is transferred to cosmic rays (CRs), the absence of radio synchrotron emission suggests that the SN kinetic energy $\lesssim3\times 10^{47}(B/10\,\mu\rm G)^{-1.65}$ erg, which is 3 to 4 orders of magnitude lower than that of typical SNRs and the lowest measured among Galactic SNRs. There is also an indication of inefficient CR acceleration for this source. The low SN kinetic energy either implies
In medical imaging, access to data is commonly limited due to patient privacy restrictions and the issue that it can be difficult to acquire enough data in the case of rare diseases.[1] The purpose of this investigation was to develop a reusable open-source synthetic image generation pipeline, the GAN Image Synthesis Tool (GIST), that is easy to use as well as easy to deploy. The pipeline helps to improve and standardize AI algorithms in the digital health space by generating high quality synthetic image data that is not linked to specific patients. Its image generation capabilities include the ability to generate imaging of pathologies or injuries with low incidence rates. This improvement of digital health AI algorithms could improve diagnostic accuracy, aid in patient care, decrease medicolegal claims, and ultimately decrease the overall cost of healthcare. The pipeline builds on existing Generative Adversarial Networks (GANs) algorithms, and preprocessing and evaluation steps were included for completeness. For this work, we focused on ensuring the pipeline supports radiography, with a focus on synthetic knee and elbow x-ray images. In designing the pipeline, we evaluated the p
The lattice problem for models of Peano Arithmetic ($\mathsf{PA}$) is to determine which lattices can be represented as lattices of elementary submodels of a model of $\mathsf{PA}$, or, in greater generality, for a given model $\mathcal{M}$, which lattices can be represented as interstructure lattices of elementary submodels $\mathcal{K}$ of an elementary extension $\mathcal{N}$ such that $\mathcal{M} \preccurlyeq \mathcal{K} \preccurlyeq \mathcal{N}$. The problem has been studied for the last 60 years, and the results and their proofs show an interesting interplay between the model theory of $\mathsf{PA}$, Ramsey-style combinatorics, lattice representation theory, and elementary number theory. We present a survey of the most important results together with a detailed analysis of some special cases to explain and motivate a technique developed by James Schmerl for constructing elementary extensions with prescribed interstructure lattices. The last section is devoted to a discussion of lesser-known results about lattices of elementary submodels of countable recursively saturated models of $\mathsf{PA}$.
We prove that the curved Ovsienko--Redou operators and a related family of differential operators are formally self-adjoint. This verifies two conjectures of Case, Lin, and Yuan.
The Segment Anything Model (SAM) has exhibited outstanding performance in various image segmentation tasks. Despite being trained with over a billion masks, SAM faces challenges in mask prediction quality in numerous scenarios, especially in real-world contexts. In this paper, we introduce a novel prompt-driven adapter into SAM, namely Prompt Adapter Segment Anything Model (PA-SAM), aiming to enhance the segmentation mask quality of the original SAM. By exclusively training the prompt adapter, PA-SAM extracts detailed information from images and optimizes the mask decoder feature at both sparse and dense prompt levels, improving the segmentation performance of SAM to produce high-quality masks. Experimental results demonstrate that our PA-SAM outperforms other SAM-based methods in high-quality, zero-shot, and open-set segmentation. We're making the source code and models available at https://github.com/xzz2/pa-sam.
We study the azimuthal angular decorrelations of dijet production in both proton-proton (pp) and proton-nucleus (pA) collisions. By utilizing soft-collinear effective theory, we establish the factorization and resummation formalism at the next-to-leading logarithmic accuracy for the azimuthal angular decorrelations in the back-to-back limit in pp collisions. We propose an approach where the nuclear modifications to dijet production in pA collisions are accounted for in the nuclear modified transverse momentum dependent parton distribution functions (nTMDPDFs), which contain both collinear and transverse dynamics. This approach naturally generalizes the well-established formalism related to the nuclear modified collinear parton distribution functions (nPDFs). We demonstrate strong consistency between our methodology and the CMS measurements in both pp and pA collisions, and make predictions for dijet production in the forward rapidity region in pA collisions at LHC kinematics and for mid-rapidity kinematics at sPHENIX. Throughout this paper, we focus on the application of this formalism to a simultaneous fit to both collinear and transverse momentum dependent contributions to the tr
The media's representation of illicit substance use can lead to harmful stereotypes and stigmatization for individuals struggling with addiction, ultimately influencing public perception, policy, and public health outcomes. To explore how the discourse and coverage of illicit drug use changed over time, this study analyzes 157,476 articles published in the Philadelphia Inquirer over a decade. Specifically, the study focuses on articles that mentioned at least one commonly abused substance, resulting in a sample of 3,903 articles. Our analysis shows that cannabis and narcotics are the most frequently discussed classes of drugs. Hallucinogenic drugs are portrayed more positively than other categories, whereas narcotics are portrayed the most negatively. Our research aims to highlight the need for accurate and inclusive portrayals of substance use and addiction in the media.
Massive multiple-input multiple-output (MIMO) precoders are typically designed by minimizing the transmit power subject to a quality-of-service (QoS) constraint. However, current sustainability goals incentivize more energy-efficient solutions, and thus it is of paramount importance to minimize the consumed power directly. Minimizing the consumed power of the power amplifier (PA), one of the most power-consuming components, gives rise to a convex, non-differentiable optimization problem, which has been solved in the past using conventional convex solvers. Alternatively, this problem can be solved using a proximal gradient descent (PGD) algorithm, which suffers from slow convergence. In this work, to overcome the slow convergence, a deep unfolded version of the algorithm is proposed, which can achieve close-to-optimal solutions in only 20 iterations, compared to the more than 3500 iterations needed by the PGD algorithm. Results indicate that the deep unfolding algorithm is three orders of magnitude faster than a conventional convex solver and four orders of magnitude faster than the PGD.
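To make the PGD baseline concrete, the sketch below applies proximal gradient descent to a generic convex, non-differentiable problem. It is an illustration under assumptions, not the precoding problem of this work: the objective 0.5 * ||A x - b||^2 + lam * ||x||_1 is a stand-in chosen because its proximal step (soft thresholding) has a closed form, and all names and parameters are hypothetical.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a single scalar."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def proximal_gradient(A, b, lam, step, iters):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 by proximal gradient descent."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - b of the smooth least-squares term.
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # Gradient of the smooth term: A^T r.
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step on the smooth part, then the prox of the non-smooth part.
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x
```

Each iteration alternates a gradient step with a proximal step, and convergence is guaranteed when `step` is at most the reciprocal of the largest eigenvalue of A^T A. Deep unfolding, as described above, fixes the number of such iterations and learns their parameters instead of running them until convergence.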
Naming tests represent an essential tool in gauging the severity of aphasia and monitoring the trajectory of recovery for individuals afflicted with this debilitating condition. In these assessments, patients are presented with images corresponding to common nouns, and their responses are evaluated for accuracy. The Philadelphia Naming Test (PNT) stands as a paragon in this domain, offering nuanced insights into the type of errors made in responses. In a groundbreaking advancement, Walker et al. (2018) introduced a model rooted in Item Response Theory and multinomial processing trees (MPT-IRT). This innovative approach seeks to unravel the intricate mechanisms underlying the various errors patients make when responding to an item, aiming to pinpoint the specific stage of word production where a patient's capability falters. However, given the sophisticated nature of the MPT-IRT model proposed by Walker et al. (2018), it is imperative to scrutinize both its conceptual and its statistical validity. Our endeavor here is to closely examine the model's formulation to ensure its parameters are identifiable as a first step in evaluating its validity.
To comprehensively evaluate a public policy intervention, researchers must consider the effects of the policy not just on the implementing region, but also on nearby, indirectly affected regions. For example, an excise tax on sweetened beverages in Philadelphia was shown to be associated not only with a decrease in volume sales of taxed beverages in Philadelphia, but also with an increase in sales in bordering counties not subject to the tax. The latter association may be explained by cross-border shopping behaviors of Philadelphia residents and may indicate a causal effect of the tax on nearby regions, which may offset the total effect of the intervention. To estimate causal effects in this setting, we extend difference-in-differences methodology to account for such interference between regions and adjust for potential confounding present in quasi-experimental evaluations. Our doubly robust estimators for the average treatment effect on the treated and neighboring control relax standard assumptions on interference and model specification. We apply these methods to evaluate the change in volume sales of taxed beverages in 231 Philadelphia and bordering county stores due to the Philadelphia bev
Philadelphia's problem with high crime rates continues to be exacerbated as Philadelphia's residents, community leaders, and law enforcement officials struggle to address the root causes of the problem and make the city safer for all. In this work, we analyze crime in Philadelphia in depth and offer novel insights for crime mitigation within the city. Open source crime data from 2012-2022 was obtained from OpenDataPhilly. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) was used to cluster geographic locations of crimes. Clustering of crimes within each of 21 police districts was performed, and temporal changes in cluster distributions were analyzed to develop a Non-Systemic Index (NSI). Home Owners' Loan Corporation (HOLC) grades were tested for associations with clusters in police districts labeled `systemic.' Crimes within each district were highly clusterable, according to Hopkins' Mean Statistics. NSI proved to be a good measure for differentiating systemic ($<$ 0.06) and non-systemic ($\geq$ 0.06) districts. Two systemic districts, 19 and 25, were found to be significantly correlated with HOLC grade (p $=2.02 \times 10^{-19}$, p $=1.52 \times 10^{-13}$)
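The DBSCAN step named above can be illustrated with a compact, dependency-free sketch. It is a minimal version of the standard algorithm under assumptions, not the study's pipeline: distances are plain Euclidean distances on already-projected coordinates, and the parameter names `eps` and `min_pts` as well as the example points in the usage note are hypothetical.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so expand through it
    return labels
```

For instance, `dbscan([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)], 1.5, 3)` groups the first three and the next three points into two clusters and marks the last point as noise. For raw latitude and longitude, a projection or a great-circle distance would be needed in place of `math.dist`.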