Background: Open source software (OSS) libraries are critical components of modern software systems, yet their metadata, particularly links to source code repositories and donation platforms, is often incomplete, outdated, or inconsistent. Such deficiencies hinder dependency monitoring, security assessment, and the sustainability of OSS projects. Aims: This study aims to explain notable metadata practices in PyPI libraries, focusing on platform dominance, outdated links, and missing references to repositories and donation platforms. As this investigation relies on large-scale qualitative survey data, we further evaluate the robustness and quality of the LLM-based topic modeling approach used to derive the findings. Method: We conducted two surveys targeting PyPI authors and maintainers, collecting 1,776 open-ended responses. To analyze these responses, we developed an LLM-based topic modeling pipeline using LLaMA 3.3 70B, including preprocessing, topic extraction, and topic merging. Robustness was assessed across 30 repeated runs using Jaccard and cosine similarity, while topic quality was evaluated by 23 experts using a structured assessment framework and Randolph's Kappa. Results: T
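A minimal sketch of the run-to-run robustness check described above, assuming each run's topics are available as keyword sets (for Jaccard) and embedding vectors (for cosine); the greedy best-match pairing and the toy topics are illustrative assumptions, not the paper's exact pipeline.

```python
# Illustrative robustness check between two topic-modeling runs.
# Topics are assumed to be keyword sets (Jaccard) plus embedding vectors (cosine);
# the best-match pairing below is a simplification, not the paper's method.
import numpy as np

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def run_similarity(topics_a, topics_b, emb_a, emb_b):
    """Average best-match similarity of run A's topics against run B's."""
    jac = [max(jaccard(ta, tb) for tb in topics_b) for ta in topics_a]
    cos = [max(cosine(ea, eb) for eb in emb_b) for ea in emb_a]
    return float(np.mean(jac)), float(np.mean(cos))

# Hypothetical topics from two of the 30 repeated runs.
run_a = [{"repository", "link", "outdated"}, {"donation", "funding", "sponsor"}]
run_b = [{"repository", "url", "outdated"}, {"donation", "sponsor", "platform"}]
emb_a = [np.array([1.0, 0.1]), np.array([0.2, 1.0])]
emb_b = [np.array([0.9, 0.2]), np.array([0.1, 1.1])]
print(run_similarity(run_a, run_b, emb_a, emb_b))
```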
As the knowledge landscape evolves and large language models (LLMs) become increasingly widespread, there is a growing need to keep these models updated with current events. While existing benchmarks assess general factual recall, few studies explore how LLMs retain knowledge over time or across different regions. To address these gaps, we present the Timely Events Benchmark (TiEBe), a dataset of over 23,000 question-answer pairs centered on notable global and regional events, spanning more than 10 years of events, 23 regions, and 13 languages. TiEBe leverages structured retrospective data from Wikipedia to identify notable events through time. These events are then used to construct a benchmark to evaluate LLMs' understanding of global and regional developments, grounded in factual evidence beyond Wikipedia itself. Our results reveal significant geographic disparities in factual recall, emphasizing the need for more balanced global representation in LLM training. We also observe a Pearson correlation of more than 0.7 between models' performance in TiEBe and various countries' socioeconomic indicators, such as HDI. In addition, we examine the impact of language on factual recall by
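A small sketch of the reported correlation analysis, assuming per-country benchmark accuracies and HDI values have already been collected; the country codes and numbers below are placeholders, not TiEBe results.

```python
# Correlate per-country benchmark accuracy with a socioeconomic indicator (e.g., HDI).
# All values are hypothetical placeholders for illustration only.
from scipy.stats import pearsonr

accuracy = {"BR": 0.61, "US": 0.78, "IN": 0.58, "DE": 0.74, "NG": 0.49}
hdi      = {"BR": 0.754, "US": 0.921, "IN": 0.633, "DE": 0.942, "NG": 0.535}

countries = sorted(accuracy)
r, p = pearsonr([accuracy[c] for c in countries], [hdi[c] for c in countries])
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```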
How should a buyer design procurement mechanisms when suppliers' costs are unknown and the buyer does not have a prior belief over them? We demonstrate that simple mechanisms that share a constant fraction of the buyer's utility with the seller allow the buyer to realize a guaranteed positive fraction of the efficient social surplus across all possible costs. Moreover, a judicious choice of the share, based on the known demand, maximizes the surplus-ratio guarantee that can be attained across all possible (arbitrarily complex and nonlinear) mechanisms and cost functions. Similar results hold in related nonlinear pricing and optimal regulation problems.
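A purely numerical illustration, not the paper's worst-case analysis: under a payment rule that hands the seller a constant share of the buyer's gross value, the seller's best response still leaves a positive fraction of the efficient surplus for each of the hypothetical cost functions tried here; the value function, cost functions, and share are all assumptions.

```python
# Numerical illustration of a constant-share procurement rule t(q) = alpha * V(q).
# The buyer's value V and the candidate cost functions are hypothetical choices;
# this is a sanity-check simulation, not the paper's guarantee proof.
import numpy as np

q = np.linspace(0.0, 1.0, 1001)          # quantity grid
V = lambda x: np.sqrt(x)                 # buyer's (known) gross value
costs = [lambda x: 0.3 * x, lambda x: 0.8 * x**2, lambda x: 0.5 * x**1.5]
alpha = 0.5                              # share of value paid to the seller

for C in costs:
    efficient = np.max(V(q) - C(q))               # efficient social surplus
    q_hat = q[np.argmax(alpha * V(q) - C(q))]     # seller's best response to the share rule
    realized = V(q_hat) - C(q_hat)                # surplus actually realized
    print(f"surplus ratio = {realized / efficient:.2f}")
```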
Prompt-based learning is vulnerable to backdoor attacks. Existing backdoor attacks against prompt-based models consider injecting backdoors into the entire embedding layers or word embedding vectors. Such attacks can be easily affected by retraining on downstream tasks and with different prompting strategies, limiting the transferability of backdoor attacks. In this work, we propose transferable backdoor attacks against prompt-based models, called NOTABLE, which is independent of downstream tasks and prompting strategies. Specifically, NOTABLE injects backdoors into the encoders of PLMs by utilizing an adaptive verbalizer to bind triggers to specific words (i.e., anchors). It activates the backdoor by pasting input with triggers to reach adversary-desired anchors, achieving independence from downstream tasks and prompting strategies. We conduct experiments on six NLP tasks, three popular models, and three prompting strategies. Empirical results show that NOTABLE achieves superior attack performance (i.e., attack success rate over 90% on all the datasets), and outperforms two state-of-the-art baselines. Evaluations on three defenses show the robustness of NOTABLE. Our code can be fo
The discharge summary is one of the critical documents in the patient journey, encompassing all events experienced during hospitalization, including multiple visits, medications, tests, surgery/procedures, and admissions/discharge. Providing a summary of the patient's progress is crucial, as it significantly influences future care and planning. Consequently, clinicians face the laborious and resource-intensive task of manually collecting, organizing, and combining all the necessary data for a discharge summary. Therefore, we propose "NOTE", which stands for "Notable generation Of patient Text summaries through an Efficient approach based on direct preference optimization". NOTE is based on the Medical Information Mart for Intensive Care-III (MIMIC-III) dataset and summarizes a single hospitalization of a patient. Patient events are sequentially combined and used to generate a discharge summary for each hospitalization. In the present circumstances, large language models' application programming interfaces (LLMs' APIs) are widely available, but importing and exporting medical data presents significant challenges due to privacy protection policies in healthcare institutions. Moreover, to ensure optim
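For reference, the standard direct preference optimization (DPO) objective that the NOTE acronym refers to; the abstract does not spell it out, so this is the generic formulation (Rafailov et al., 2023) rather than any NOTE-specific variant.

```latex
% Standard DPO loss: \pi_\theta is the policy being trained, \pi_{ref} a frozen
% reference model, (x, y_w, y_l) a prompt with preferred and dispreferred summaries,
% and \beta a temperature hyperparameter.
\mathcal{L}_{\mathrm{DPO}}(\theta) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[
    \log \sigma\!\Big(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \Big)
  \right]
```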
This paper provides a broad, multi-disciplinary overview of key insights, persistent gaps, and future paths in youth digital well-being research from the perspectives of researchers who are conducting this work.
Computational notebooks are widely used for data analysis. Their interleaved displays of code and execution results (e.g., visualizations) are welcomed since they enable iterative analysis and preserve the exploration process. However, communicating data findings remains challenging in computational notebooks. Users have to carefully separate useful findings from useless ones, document them with text and visual embellishments, and then organize them in different tools. Such a workflow greatly increases their workload, according to our interviews with practitioners. To address this challenge, we designed Notable to offer on-the-fly assistance for data storytelling in computational notebooks. It provides intelligent support to minimize the work of documenting and organizing data findings and diminishes the cost of switching between data exploration and storytelling. To evaluate Notable, we conducted a user study with 12 data workers. The feedback from participants verifies its effectiveness and usability.
This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect "hallucinated" responses or to decline to answer prompts altogether. Specifically, it investigates the presence of gender-based biases in LLMs' responses to factual inquiries. This paper takes a multi-pronged approach to evaluating GPT models, assessing fairness across multiple dimensions: recall, hallucinations, and declinations. Our findings reveal discernible gender disparities in the responses generated by GPT-3.5. While advancements in GPT-4 have led to improvements in performance, they have not fully eradicated these gender disparities, notably in instances where responses are declined. The study further explores the origins of these disparities by examining the influence of gender associations in prompts and the homogeneity in the responses.
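A minimal sketch of the three evaluation dimensions named above, computed per gender group from labeled model responses; the record schema and toy values are assumptions for illustration, not the study's data.

```python
# Per-group recall, hallucination, and declination rates from labeled responses.
# The (gender, outcome) schema and the toy records are hypothetical.
from collections import Counter

# outcome is one of: "correct", "hallucinated", "declined"
records = [
    ("female", "correct"), ("female", "declined"), ("female", "hallucinated"),
    ("male", "correct"), ("male", "correct"), ("male", "hallucinated"),
]

for gender in ("female", "male"):
    counts = Counter(outcome for g, outcome in records if g == gender)
    total = sum(counts.values())
    print(gender, {k: round(v / total, 2) for k, v in counts.items()})
```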
Communities on the web rely on open conversation forums for a number of tasks, including governance, information sharing, and decision making. However, these forms of collective deliberation can often result in biased outcomes. A prime example is Articles for Deletion (AfD) discussions on Wikipedia, which allow editors to gauge the notability of existing articles and, as prior work has suggested, may play a role in perpetuating the notorious gender gap of Wikipedia. Prior attempts to address this question have been hampered by narrow observation windows, reliance on limited subsets of both biographies and editorial outcomes, and potential confounding factors. To address these limitations, here we adopt a competing risk survival framework to fully situate biographical AfD discussions within the full editorial cycle of Wikipedia content. We find that biographies of women are nominated for deletion faster than those of men, despite editors taking longer to reach a consensus for deletion of women, even after controlling for the size of the discussion. Furthermore, we find that AfDs about historical figures show a strong tendency to result in the redirecting or merg
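A compact sketch of a nonparametric cumulative-incidence (Aalen-Johansen style) estimate for one event type under competing risks, the kind of quantity a competing-risk survival framework would compare between biographies of women and men; the event codes and times are toy values, and the paper's model additionally adjusts for covariates such as discussion size.

```python
# Nonparametric cumulative incidence for event type 1 (e.g., "nominated for deletion")
# in the presence of a competing event type 2 (e.g., another editorial outcome).
# Times and event codes are toy values for illustration only.
import numpy as np

times  = np.array([3, 5, 5, 8, 12, 12, 20, 25], dtype=float)   # days in the editorial cycle
events = np.array([1, 0, 1, 2, 1, 2, 0, 1])                    # 0 = censored, 1/2 = event types

order = np.argsort(times)
times, events = times[order], events[order]

surv, cif = 1.0, 0.0
for t in np.unique(times):
    at_risk = np.sum(times >= t)
    d1 = np.sum((times == t) & (events == 1))       # events of interest at t
    d_all = np.sum((times == t) & (events != 0))    # all events at t
    cif += surv * d1 / at_risk                      # increment cumulative incidence
    surv *= 1.0 - d_all / at_risk                   # all-cause event-free survival
print(f"cumulative incidence of event 1 by t={times.max():.0f}: {cif:.2f}")
```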
Large language models (LLMs) have shown remarkable advancements in chemistry and biomedical research, acting as versatile foundation models for various tasks. We introduce AMP-Designer, an LLM-based approach for swiftly designing novel antimicrobial peptides (AMPs) with desired properties. Within 11 days, AMP-Designer achieved the de novo design of 18 AMPs with broad-spectrum activity against Gram-negative bacteria. In vitro validation revealed a 94.4% success rate, with two candidates demonstrating exceptional antibacterial efficacy, minimal hemotoxicity, stability in human plasma, and low potential to induce resistance, as evidenced by significant bacterial load reduction in murine lung infection experiments. The entire process, from design to validation, concluded in 48 days. AMP-Designer excels in creating AMPs targeting specific strains despite limited data availability, with a top candidate displaying a minimum inhibitory concentration of 2.0 μg/ml against Propionibacterium acnes. Integrating advanced machine learning techniques, AMP-Designer demonstrates remarkable efficiency, paving the way for innovative solutions to antibiotic resistance.
The volume of news content has increased significantly in recent years and systems to process and deliver this information in an automated fashion at scale are becoming increasingly prevalent. One critical component that is required in such systems is a method to automatically determine how notable a certain news story is, in order to prioritize these stories during delivery. One way to do so is to compare each story in a stream of news stories to a notable event. In other words, the problem of detecting notable news can be defined as a ranking task; given a trusted source of notable events and a stream of candidate news stories, we aim to answer the question: "Which of the candidate news stories is most similar to the notable one?". We employ different combinations of features and learning to rank (LTR) models and gather relevance labels using crowdsourcing. In our approach, we use structured representations of candidate news stories (triples) and we link them to corresponding entities. Our evaluation shows that the features in our proposed method outperform standard ranking methods, and that the trained model generalizes well to unseen news stories.
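A small pairwise learning-to-rank sketch in the spirit of the setup above: given feature vectors for candidate stories and relevance labels relative to a notable event, a linear model is trained on feature differences; the synthetic features and labels are placeholders, not the paper's triple-based features or crowdsourced judgments.

```python
# Pairwise learning-to-rank via feature differences (a RankNet-style reduction).
# Feature vectors and relevance labels are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))                 # candidate-story features vs. a notable event
relevance = rng.integers(0, 3, size=40)      # graded relevance labels (0..2)

# Build pairwise differences: label 1 if the first story is more relevant.
pairs, labels = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if relevance[i] != relevance[j]:
            pairs.append(X[i] - X[j])
            labels.append(int(relevance[i] > relevance[j]))

model = LogisticRegression().fit(np.array(pairs), np.array(labels))
scores = X @ model.coef_.ravel()             # rank candidates by learned score
print("top candidate index:", int(np.argmax(scores)))
```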
Query answering routinely employs knowledge graphs to assist the user in the search process. Given a knowledge graph that represents entities and relationships among them, one aims at complementing the search with intuitive but effective mechanisms. In particular, we focus on the comparison of two or more entities and the detection of unexpected, surprising properties, called notable characteristics. Such characteristics provide intuitive explanations of the peculiarities of the selected entities with respect to similar entities. We propose a solid probabilistic approach that first retrieves entity nodes similar to the query nodes provided by the user, and then exploits distributional properties to understand whether a certain attribute is interesting or not. Our preliminary experiments demonstrate the solidity of our approach and show that we are able to discover notable characteristics that are indeed interesting and relevant for the user.
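One simple way to operationalize the "unexpectedness" of an attribute, roughly in the spirit described above: compare the query entity's attribute value against the distribution over its retrieved similar entities. The z-score rule and the toy attributes below are illustrative stand-ins for the paper's probabilistic model.

```python
# Flag an attribute as a notable characteristic when the query entity's value
# deviates strongly from the distribution over similar entities.
# The z-score threshold is an illustrative stand-in for the probabilistic approach.
import numpy as np

def notable_characteristics(query_attrs, similar_attrs, z_threshold=2.0):
    flagged = {}
    for attr, value in query_attrs.items():
        peers = np.array([e[attr] for e in similar_attrs if attr in e], dtype=float)
        if len(peers) < 2 or peers.std() == 0:
            continue
        z = (value - peers.mean()) / peers.std()
        if abs(z) >= z_threshold:
            flagged[attr] = round(float(z), 2)
    return flagged

# Hypothetical numeric attributes for a query entity and its similar entities.
query = {"num_awards": 12, "career_length": 20}
peers = [{"num_awards": 2, "career_length": 18}, {"num_awards": 3, "career_length": 22},
         {"num_awards": 1, "career_length": 19}]
print(notable_characteristics(query, peers))
```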
The steady growth of digitized historical information is continuously stimulating new approaches in the fields of Digital Humanities and Computational Social Science. In this work, we use Natural Language Processing techniques to retrieve large amounts of historical information from Wikipedia. In particular, the pages of a set of historically notable individuals are processed to extract the locations and dates of people's movements. This information is then structured in a geographical network of mobility patterns. We analyze the mobility of historically notable individuals from different perspectives to better understand the role of migrations and international collaborations in the context of innovation and cultural development. We first present some general characteristics of the dataset from a social and geographical perspective. Then, we build a spatial network of cities, and we model and quantify the tendency to explore of a set of people that can be considered historically and culturally notable. In this framework, we show that by using a multilevel radiation model for human mobility, we are able to capture important features of migration behavior. R
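For reference, the standard single-level radiation model of Simini et al. (2012), which the multilevel variant mentioned above builds on; the populations and the intervening-opportunity term in the example are toy values.

```python
# Standard radiation model: expected flux from i to j given source population m_i,
# destination population n_j, the population s_ij living within the circle of radius
# d(i, j) centered at i (excluding i and j), and the total number of movers O_i
# leaving i. All values below are toy numbers.
def radiation_flux(O_i: float, m_i: float, n_j: float, s_ij: float) -> float:
    return O_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))

# Hypothetical cities: 50,000 movers leave i; populations in thousands.
print(radiation_flux(O_i=50_000, m_i=300, n_j=800, s_ij=1_200))
```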
We introduce the GENJI program (Gamma-ray Emitting Notable AGN Monitoring by Japanese VLBI), a monitoring program of gamma-ray bright AGNs with the VERA array (VLBI Exploration of Radio Astrometry). The GENJI programme aims at dense monitoring at 22 GHz of $γ$-ray emitting active galactic nuclei (AGNs) to investigate the radio time variation of the core, the possible ejection of new radio components, the motion of jets, and their relation with the emission at other wavelengths, especially in $γ$-rays. Currently we are monitoring 8 notable $γ$-ray-emitting AGNs (DA 55, 3C 84, M 87, PKS 1510-089, DA 406, NRAO 530, BL Lac, 3C 454.3) about once every two weeks. This programme promises to trace radio time variation on shorter timescales than conventional VLBI monitoring programmes and to provide data complementary to them (e.g., MOJAVE, Boston University Blazar Project). In particular, we successfully coordinated quick follow-up observations after the GeV $γ$-ray flares in NRAO 530 and 3C 454.3 reported by the Fermi Gamma-ray Space Telescope. Here we present the initial results of morphology and light curves for the first 7 months of operation.
Identifying literary, scientific, and technical works of enduring interest is challenging. Few are able to name significant works across more than a handful of domains or languages. This paper introduces an automatic method for identifying authors of notable works throughout history. Notability is defined using the record of which works volunteers have made available in public domain digital editions. A significant benefit of this bottom-up approach is that it also provides a novel and reproducible index of notability for all individuals with Wikipedia pages. The method promises to supplement the work of cultural organizations and institutions seeking to publicize the availability of notable works and prioritize works for preservation and digitization.
We have undertaken a detailed near-IR spectroscopic analysis of eight notable white dwarfs, predominantly of southern declination. In each case the spectrum failed to reveal compelling evidence for the presence of a spatially unresolved, cool, late-type companion. Therefore, we have placed an approximate limit on the spectral-type of a putative companion to each degenerate. From these limits we conclude that if GD659, GD50, GD71 or WD2359-434 possesses an unresolved companion then most probably it is substellar in nature (M<0.072Msun). Furthermore, any spatially unresolved late-type companion to RE J0457-280, RE J0623-374, RE J0723-274 or RE J2214-491 most likely has M<0.082Msun. These results imply that if weak accretion from a nearby late-type companion is the cause of the unusual photospheric composition observed in a number of these degenerates then the companions are of very low mass, beyond the detection thresholds of this study. Furthermore, these results do not contradict a previously noted deficit of very-low-mass stellar and brown dwarf companions to main sequence F,G,K and early-M type primaries (a<1000AU).
We hear it all too often in the media: an organization is attacked, its data, often containing personally identifying information, is made public, and a hacking group emerges to claim credit. In this excerpt, we discuss how such groups operate and describe the details of a few major cyber-attacks of this sort in the wider context of how they occurred. We feel that understanding how such groups have operated in the past will give organizations ideas of how to defend against them in the future.
The rapid adoption of generative AI-powered search engines, such as ChatGPT, Perplexity, and Gemini, is fundamentally reshaping information retrieval. We are witnessing a critical shift from traditional ranked lists to synthesized, citation-backed answers. This paradigm shift challenges established Search Engine Optimization (SEO) practices and necessitates a new framework, termed Generative Engine Optimization (GEO). In highly regulated environments like the UK iGaming sector, visibility is no longer dictated by keyword density, but by an entity's ability to project "Algorithmic Trust". This report presents an empirical analysis of how compliance signals -- such as UK Gambling Commission (UKGC) standards -- function as authority multipliers for Large Language Models (LLMs) when properly structured. Recent large-scale experiments reveal that AI Search exhibits a systematic and overwhelming bias towards Earned media (third-party, authoritative sources) over Brand-owned content. Consequently, practitioners must engineer their content for machine scannability and justification to dominate these new AI-perceived authority metrics.
Operator learning provides methods to approximate mappings between infinite-dimensional function spaces. Deep operator networks (DeepONets) are a notable architecture in this field. Recently, an extension of DeepONet based on model reduction and neural networks, proper orthogonal decomposition (POD)-DeepONet, has been able to outperform other architectures in terms of accuracy for several benchmark tests. We extend this idea towards nonlinear model order reduction by proposing an efficient framework that combines neural networks with kernel principal component analysis (KPCA) for operator learning. Our results demonstrate the superior performance of KPCA-DeepONet over POD-DeepONet.
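A rough sketch of the nonlinear-reduction idea, assuming the workflow is: fit kernel PCA on training output snapshots, regress the KPCA coefficients from input parameters with a small network, and reconstruct outputs via the KPCA pre-image. The synthetic data, RBF kernel, and MLP regressor are illustrative assumptions, not the paper's exact configuration.

```python
# Kernel-PCA-based operator-learning sketch: map input parameters to KPCA coefficients
# of the output field, then reconstruct via the KPCA pre-image (inverse transform).
# The synthetic dataset and model choices are illustrative only.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_grid = np.linspace(0, 1, 64)
params = rng.uniform(0.5, 2.0, size=(200, 1))          # input parameters
outputs = np.sin(np.pi * params * x_grid)              # output functions sampled on the grid

kpca = KernelPCA(n_components=8, kernel="rbf", fit_inverse_transform=True, alpha=1e-3)
coeffs = kpca.fit_transform(outputs)                   # nonlinear "mode" coefficients

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
net.fit(params, coeffs)                                # parameter -> coefficient map

test_param = np.array([[1.3]])
pred = kpca.inverse_transform(net.predict(test_param)) # reconstruct the output field
true = np.sin(np.pi * test_param * x_grid)
print("relative L2 error:", np.linalg.norm(pred - true) / np.linalg.norm(true))
```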
The persistence barcode (equivalently, the persistence diagram), which can be obtained from the interval decomposition of a persistence module, plays a pivotal role in applications of persistent homology. For multi-parameter persistent homology, which lacks a complete discrete invariant, and where persistence modules are no longer always interval decomposable, many alternative invariants have been proposed. Many of these invariants are akin to persistence barcodes, in that they assign (possibly signed) multisets of intervals. Furthermore, to any interval decomposable module, those invariants assign the multiset of intervals that correspond to its summands. Naturally, identifying the relationships among invariants of this type, or ordering them by their discriminating power, is a fundamental question. To address this, we formalize the notion of barcoding invariants and compare their discriminating powers. Notably, this formalization enables us to prove that all barcoding invariants with the same basis possess equivalent discriminating power. One implication of our result is that introducing a new barcoding invariant does not add any value in terms of its generic discriminating power