Background: Open source software (OSS) libraries are critical components of modern software systems, yet their metadata, particularly links to source code repositories and donation platforms, is often incomplete, outdated, or inconsistent. Such deficiencies hinder dependency monitoring, security assessment, and the sustainability of OSS projects. Aims: This study aims to explain notable metadata practices in PyPI libraries, focusing on platform dominance, outdated links, and missing references to repositories and donation platforms. As this investigation relies on large-scale qualitative survey data, we further evaluate the robustness and quality of the LLM-based topic modeling approach used to derive the findings. Method: We conducted two surveys targeting PyPI authors and maintainers, collecting 1,776 open-ended responses. To analyze these responses, we developed an LLM-based topic modeling pipeline using LLaMA 3.3 70B, including preprocessing, topic extraction, and topic merging. Robustness was assessed across 30 repeated runs using Jaccard and cosine similarity, while topic quality was evaluated by 23 experts using a structured assessment framework and Randolph's Kappa. Results: T
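A minimal sketch of the run-to-run robustness check described above, assuming each run's topics are available as keyword sets (for Jaccard) and embedding vectors (for cosine); the greedy best-match pairing and the toy topics are illustrative assumptions, not the paper's exact pipeline.

```python
# Illustrative robustness check between two topic-modeling runs.
# Topics are assumed to be keyword sets (Jaccard) plus embedding vectors (cosine);
# the best-match pairing below is a simplification, not the paper's method.
import numpy as np

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def run_similarity(topics_a, topics_b, emb_a, emb_b):
    """Average best-match similarity of run A's topics against run B's."""
    jac = [max(jaccard(ta, tb) for tb in topics_b) for ta in topics_a]
    cos = [max(cosine(ea, eb) for eb in emb_b) for ea in emb_a]
    return float(np.mean(jac)), float(np.mean(cos))

# Hypothetical topics from two of the 30 repeated runs.
run_a = [{"repository", "link", "outdated"}, {"donation", "funding", "sponsor"}]
run_b = [{"repository", "url", "outdated"}, {"donation", "sponsor", "platform"}]
emb_a = [np.array([1.0, 0.1]), np.array([0.2, 1.0])]
emb_b = [np.array([0.9, 0.2]), np.array([0.1, 1.1])]
print(run_similarity(run_a, run_b, emb_a, emb_b))
```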
As the knowledge landscape evolves and large language models (LLMs) become increasingly widespread, there is a growing need to keep these models updated with current events. While existing benchmarks assess general factual recall, few studies explore how LLMs retain knowledge over time or across different regions. To address these gaps, we present the Timely Events Benchmark (TiEBe), a dataset of over 23,000 question-answer pairs centered on notable global and regional events, spanning more than 10 years of events, 23 regions, and 13 languages. TiEBe leverages structured retrospective data from Wikipedia to identify notable events through time. These events are then used to construct a benchmark to evaluate LLMs' understanding of global and regional developments, grounded in factual evidence beyond Wikipedia itself. Our results reveal significant geographic disparities in factual recall, emphasizing the need for more balanced global representation in LLM training. We also observe a Pearson correlation of more than 0.7 between models' performance in TiEBe and various countries' socioeconomic indicators, such as HDI. In addition, we examine the impact of language on factual recall by
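A small sketch of the reported correlation analysis, assuming per-country benchmark accuracies and HDI values have already been collected; the country codes and numbers below are placeholders, not TiEBe results.

```python
# Correlate per-country benchmark accuracy with a socioeconomic indicator (e.g., HDI).
# All values are hypothetical placeholders for illustration only.
from scipy.stats import pearsonr

accuracy = {"BR": 0.61, "US": 0.78, "IN": 0.58, "DE": 0.74, "NG": 0.49}
hdi      = {"BR": 0.754, "US": 0.921, "IN": 0.633, "DE": 0.942, "NG": 0.535}

countries = sorted(accuracy)
r, p = pearsonr([accuracy[c] for c in countries], [hdi[c] for c in countries])
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```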
How should a buyer design procurement mechanisms when suppliers' costs are unknown and the buyer does not have a prior belief over them? We demonstrate that simple mechanisms that share a constant fraction of the buyer's utility with the seller allow the buyer to realize a guaranteed positive fraction of the efficient social surplus across all possible costs. Moreover, a judicious choice of the share, based on the known demand, maximizes the surplus-ratio guarantee that can be attained across all possible (arbitrarily complex and nonlinear) mechanisms and cost functions. Similar results hold in related nonlinear pricing and optimal regulation problems.
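A purely numerical illustration, not the paper's worst-case analysis: under a payment rule that hands the seller a constant share of the buyer's gross value, the seller's best response still leaves a positive fraction of the efficient surplus for each of the hypothetical cost functions tried here; the value function, cost functions, and share are all assumptions.

```python
# Numerical illustration of a constant-share procurement rule t(q) = alpha * V(q).
# The buyer's value V and the candidate cost functions are hypothetical choices;
# this is a sanity-check simulation, not the paper's guarantee proof.
import numpy as np

q = np.linspace(0.0, 1.0, 1001)          # quantity grid
V = lambda x: np.sqrt(x)                 # buyer's (known) gross value
costs = [lambda x: 0.3 * x, lambda x: 0.8 * x**2, lambda x: 0.5 * x**1.5]
alpha = 0.5                              # share of value paid to the seller

for C in costs:
    efficient = np.max(V(q) - C(q))               # efficient social surplus
    q_hat = q[np.argmax(alpha * V(q) - C(q))]     # seller's best response to the share rule
    realized = V(q_hat) - C(q_hat)                # surplus actually realized
    print(f"surplus ratio = {realized / efficient:.2f}")
```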
Prompt-based learning is vulnerable to backdoor attacks. Existing backdoor attacks against prompt-based models consider injecting backdoors into the entire embedding layers or word embedding vectors. Such attacks can be easily affected by retraining on downstream tasks and with different prompting strategies, limiting the transferability of backdoor attacks. In this work, we propose transferable backdoor attacks against prompt-based models, called NOTABLE, which is independent of downstream tasks and prompting strategies. Specifically, NOTABLE injects backdoors into the encoders of PLMs by utilizing an adaptive verbalizer to bind triggers to specific words (i.e., anchors). It activates the backdoor by pasting input with triggers to reach adversary-desired anchors, achieving independence from downstream tasks and prompting strategies. We conduct experiments on six NLP tasks, three popular models, and three prompting strategies. Empirical results show that NOTABLE achieves superior attack performance (i.e., attack success rate over 90% on all the datasets), and outperforms two state-of-the-art baselines. Evaluations on three defenses show the robustness of NOTABLE. Our code can be fo
The discharge summary is one of the critical documents in the patient journey, encompassing all events experienced during hospitalization, including multiple visits, medications, tests, surgery/procedures, and admissions/discharge. Providing a summary of the patient's progress is crucial, as it significantly influences future care and planning. Consequently, clinicians face the laborious and resource-intensive task of manually collecting, organizing, and combining all the necessary data for a discharge summary. Therefore, we propose "NOTE", which stands for "Notable generation Of patient Text summaries through an Efficient approach based on direct preference optimization". NOTE is based on the Medical Information Mart for Intensive Care-III (MIMIC-III) dataset and summarizes a single hospitalization of a patient. Patient events are sequentially combined and used to generate a discharge summary for each hospitalization. In the present circumstances, large language models' application programming interfaces (LLMs' APIs) are widely available, but importing and exporting medical data presents significant challenges due to privacy protection policies in healthcare institutions. Moreover, to ensure optim
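For reference, the standard direct preference optimization (DPO) objective that the NOTE acronym refers to; the abstract does not spell it out, so this is the generic formulation (Rafailov et al., 2023) rather than any NOTE-specific variant.

```latex
% Standard DPO loss: \pi_\theta is the policy being trained, \pi_{ref} a frozen
% reference model, (x, y_w, y_l) a prompt with preferred and dispreferred summaries,
% and \beta a temperature hyperparameter.
\mathcal{L}_{\mathrm{DPO}}(\theta) =
  -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\!\left[
    \log \sigma\!\Big(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \Big)
  \right]
```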
This paper provides a broad, multi-disciplinary overview of key insights, persistent gaps, and future paths in youth digital well-being research from the perspectives of researchers who are conducting this work.
Computational notebooks are widely used for data analysis. Their interleaved displays of code and execution results (e.g., visualizations) are welcomed since they enable iterative analysis and preserve the exploration process. However, communicating data findings remains challenging in computational notebooks. Users have to carefully separate useful findings from useless ones, document them with text and visual embellishments, and then organize them in different tools. Such a workflow greatly increases their workload, according to our interviews with practitioners. To address this challenge, we designed Notable to offer on-the-fly assistance for data storytelling in computational notebooks. It provides intelligent support to minimize the work of documenting and organizing data findings and diminishes the cost of switching between data exploration and storytelling. To evaluate Notable, we conducted a user study with 12 data workers. The feedback from participants verifies its effectiveness and usability.
This study examines the use of Large Language Models (LLMs) for retrieving factual information, addressing concerns over their propensity to produce factually incorrect "hallucinated" responses or to decline to answer prompts altogether. Specifically, it investigates the presence of gender-based biases in LLMs' responses to factual inquiries. This paper takes a multi-pronged approach to evaluating GPT models, assessing fairness across multiple dimensions: recall, hallucinations, and declinations. Our findings reveal discernible gender disparities in the responses generated by GPT-3.5. While advancements in GPT-4 have led to improvements in performance, they have not fully eradicated these gender disparities, notably in instances where responses are declined. The study further explores the origins of these disparities by examining the influence of gender associations in prompts and the homogeneity in the responses.
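A minimal sketch of the three evaluation dimensions named above, computed per gender group from labeled model responses; the record schema and toy values are assumptions for illustration, not the study's data.

```python
# Per-group recall, hallucination, and declination rates from labeled responses.
# The (gender, outcome) schema and the toy records are hypothetical.
from collections import Counter

# outcome is one of: "correct", "hallucinated", "declined"
records = [
    ("female", "correct"), ("female", "declined"), ("female", "hallucinated"),
    ("male", "correct"), ("male", "correct"), ("male", "hallucinated"),
]

for gender in ("female", "male"):
    counts = Counter(outcome for g, outcome in records if g == gender)
    total = sum(counts.values())
    print(gender, {k: round(v / total, 2) for k, v in counts.items()})
```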
Communities on the web rely on open conversation forums for a number of tasks, including governance, information sharing, and decision making. However, these forms of collective deliberation can often result in biased outcomes. A prime example is Articles for Deletion (AfD) discussions on Wikipedia, which allow editors to gauge the notability of existing articles and, as prior work has suggested, may play a role in perpetuating the notorious gender gap of Wikipedia. Prior attempts to address this question have been hampered by narrow observation windows, reliance on limited subsets of both biographies and editorial outcomes, and potential confounding factors. To address these limitations, here we adopt a competing risk survival framework to fully situate biographical AfD discussions within the full editorial cycle of Wikipedia content. We find that biographies of women are nominated for deletion faster than those of men, despite editors taking longer to reach a consensus for deletion of women, even after controlling for the size of the discussion. Furthermore, we find that AfDs about historical figures show a strong tendency to result in the redirecting or merg
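A compact sketch of a nonparametric cumulative-incidence (Aalen-Johansen style) estimate for one event type under competing risks, the kind of quantity a competing-risk survival framework would compare between biographies of women and men; the event codes and times are toy values, and the paper's model additionally adjusts for covariates such as discussion size.

```python
# Nonparametric cumulative incidence for event type 1 (e.g., "nominated for deletion")
# in the presence of a competing event type 2 (e.g., another editorial outcome).
# Times and event codes are toy values for illustration only.
import numpy as np

times  = np.array([3, 5, 5, 8, 12, 12, 20, 25], dtype=float)   # days in the editorial cycle
events = np.array([1, 0, 1, 2, 1, 2, 0, 1])                    # 0 = censored, 1/2 = event types

order = np.argsort(times)
times, events = times[order], events[order]

surv, cif = 1.0, 0.0
for t in np.unique(times):
    at_risk = np.sum(times >= t)
    d1 = np.sum((times == t) & (events == 1))       # events of interest at t
    d_all = np.sum((times == t) & (events != 0))    # all events at t
    cif += surv * d1 / at_risk                      # increment cumulative incidence
    surv *= 1.0 - d_all / at_risk                   # all-cause event-free survival
print(f"cumulative incidence of event 1 by t={times.max():.0f}: {cif:.2f}")
```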
Large language models (LLMs) have shown remarkable advancements in chemistry and biomedical research, acting as versatile foundation models for various tasks. We introduce AMP-Designer, an LLM-based approach for swiftly designing novel antimicrobial peptides (AMPs) with desired properties. Within 11 days, AMP-Designer achieved the de novo design of 18 AMPs with broad-spectrum activity against Gram-negative bacteria. In vitro validation revealed a 94.4% success rate, with two candidates demonstrating exceptional antibacterial efficacy, minimal hemotoxicity, stability in human plasma, and low potential to induce resistance, as evidenced by significant bacterial load reduction in murine lung infection experiments. The entire process, from design to validation, concluded in 48 days. AMP-Designer excels in creating AMPs targeting specific strains despite limited data availability, with a top candidate displaying a minimum inhibitory concentration of 2.0 μg/ml against Propionibacterium acnes. Integrating advanced machine learning techniques, AMP-Designer demonstrates remarkable efficiency, paving the way for innovative solutions to antibiotic resistance.
The volume of news content has increased significantly in recent years and systems to process and deliver this information in an automated fashion at scale are becoming increasingly prevalent. One critical component that is required in such systems is a method to automatically determine how notable a certain news story is, in order to prioritize these stories during delivery. One way to do so is to compare each story in a stream of news stories to a notable event. In other words, the problem of detecting notable news can be defined as a ranking task; given a trusted source of notable events and a stream of candidate news stories, we aim to answer the question: "Which of the candidate news stories is most similar to the notable one?". We employ different combinations of features and learning to rank (LTR) models and gather relevance labels using crowdsourcing. In our approach, we use structured representations of candidate news stories (triples) and we link them to corresponding entities. Our evaluation shows that the features in our proposed method outperform standard ranking methods, and that the trained model generalizes well to unseen news stories.
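A small pairwise learning-to-rank sketch in the spirit of the setup above: given feature vectors for candidate stories and relevance labels relative to a notable event, a linear model is trained on feature differences; the synthetic features and labels are placeholders, not the paper's triple-based features or crowdsourced judgments.

```python
# Pairwise learning-to-rank via feature differences (a RankNet-style reduction).
# Feature vectors and relevance labels are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))                 # candidate-story features vs. a notable event
relevance = rng.integers(0, 3, size=40)      # graded relevance labels (0..2)

# Build pairwise differences: label 1 if the first story is more relevant.
pairs, labels = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if relevance[i] != relevance[j]:
            pairs.append(X[i] - X[j])
            labels.append(int(relevance[i] > relevance[j]))

model = LogisticRegression().fit(np.array(pairs), np.array(labels))
scores = X @ model.coef_.ravel()             # rank candidates by learned score
print("top candidate index:", int(np.argmax(scores)))
```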
Query answering routinely employs knowledge graphs to assist the user in the search process. Given a knowledge graph that represents entities and relationships among them, one aims at complementing the search with intuitive but effective mechanisms. In particular, we focus on the comparison of two or more entities and the detection of unexpected, surprising properties, called notable characteristics. Such characteristics provide intuitive explanations of the peculiarities of the selected entities with respect to similar entities. We propose a solid probabilistic approach that first retrieves entity nodes similar to the query nodes provided by the user, and then exploits distributional properties to understand whether a certain attribute is interesting or not. Our preliminary experiments demonstrate the solidity of our approach and show that we are able to discover notable characteristics that are indeed interesting and relevant for the user.
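One simple way to operationalize the "unexpectedness" of an attribute, roughly in the spirit described above: compare the query entity's attribute value against the distribution over its retrieved similar entities. The z-score rule and the toy attributes below are illustrative stand-ins for the paper's probabilistic model.

```python
# Flag an attribute as a notable characteristic when the query entity's value
# deviates strongly from the distribution over similar entities.
# The z-score threshold is an illustrative stand-in for the probabilistic approach.
import numpy as np

def notable_characteristics(query_attrs, similar_attrs, z_threshold=2.0):
    flagged = {}
    for attr, value in query_attrs.items():
        peers = np.array([e[attr] for e in similar_attrs if attr in e], dtype=float)
        if len(peers) < 2 or peers.std() == 0:
            continue
        z = (value - peers.mean()) / peers.std()
        if abs(z) >= z_threshold:
            flagged[attr] = round(float(z), 2)
    return flagged

# Hypothetical numeric attributes for a query entity and its similar entities.
query = {"num_awards": 12, "career_length": 20}
peers = [{"num_awards": 2, "career_length": 18}, {"num_awards": 3, "career_length": 22},
         {"num_awards": 1, "career_length": 19}]
print(notable_characteristics(query, peers))
```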
The steady growth of digitized historical information is continuously stimulating new approaches in the fields of Digital Humanities and Computational Social Science. In this work, we use Natural Language Processing techniques to retrieve large amounts of historical information from Wikipedia. In particular, the pages of a set of historically notable individuals are processed to extract the locations and dates of people's movements. This information is then structured in a geographical network of mobility patterns. We analyze the mobility of historically notable individuals from different perspectives to better understand the role of migrations and international collaborations in the context of innovation and cultural development. We first present some general characteristics of the dataset from a social and geographical perspective. Then, we build a spatial network of cities, and we model and quantify the tendency to explore of a set of people that can be considered historically and culturally notable. In this framework, we show that by using a multilevel radiation model for human mobility, we are able to capture important features of migration behavior. R
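For reference, the standard single-level radiation model of Simini et al. (2012), which the multilevel variant mentioned above builds on; the populations and the intervening-opportunity term in the example are toy values.

```python
# Standard radiation model: expected flux from i to j given source population m_i,
# destination population n_j, the population s_ij living within the circle of radius
# d(i, j) centered at i (excluding i and j), and the total number of movers O_i
# leaving i. All values below are toy numbers.
def radiation_flux(O_i: float, m_i: float, n_j: float, s_ij: float) -> float:
    return O_i * (m_i * n_j) / ((m_i + s_ij) * (m_i + n_j + s_ij))

# Hypothetical cities: 50,000 movers leave i; populations in thousands.
print(radiation_flux(O_i=50_000, m_i=300, n_j=800, s_ij=1_200))
```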
We introduce the GENJI program (Gamma-ray Emitting Notable AGN Monitoring by Japanese VLBI), a monitoring program of gamma-ray bright AGNs with the VERA array (VLBI Exploration of Radio Astrometry). The GENJI programme aims at dense monitoring at 22 GHz of $γ$-ray emitting active galactic nuclei (AGNs) to investigate the radio time variation of the core, the possible ejection of new radio components, the motion of jets, and their relation with the emission at other wavelengths, especially in $γ$-rays. Currently we are monitoring 8 notable $γ$-ray-emitting AGNs (DA 55, 3C 84, M 87, PKS 1510-089, DA 406, NRAO 530, BL Lac, 3C 454.3) about once every two weeks. This programme promises to trace radio time variation on shorter timescales than conventional VLBI monitoring programmes and to provide data complementary to them (e.g., MOJAVE, Boston University Blazar Project). In particular, we successfully coordinated quick follow-up observations after the GeV $γ$-ray flares in NRAO 530 and 3C 454.3 reported by the Fermi Gamma-ray Space Telescope. Here we present the initial results of morphology and light curves for the first 7 months of operation.
Identifying literary, scientific, and technical works of enduring interest is challenging. Few are able to name significant works across more than a handful of domains or languages. This paper introduces an automatic method for identifying authors of notable works throughout history. Notability is defined using the record of which works volunteers have made available in public domain digital editions. A significant benefit of this bottom-up approach is that it also provides a novel and reproducible index of notability for all individuals with Wikipedia pages. The method promises to supplement the work of cultural organizations and institutions seeking to publicize the availability of notable works and prioritize works for preservation and digitization.
We have undertaken a detailed near-IR spectroscopic analysis of eight notable white dwarfs, predominantly of southern declination. In each case the spectrum failed to reveal compelling evidence for the presence of a spatially unresolved, cool, late-type companion. Therefore, we have placed an approximate limit on the spectral-type of a putative companion to each degenerate. From these limits we conclude that if GD659, GD50, GD71 or WD2359-434 possesses an unresolved companion then most probably it is substellar in nature (M<0.072Msun). Furthermore, any spatially unresolved late-type companion to RE J0457-280, RE J0623-374, RE J0723-274 or RE J2214-491 most likely has M<0.082Msun. These results imply that if weak accretion from a nearby late-type companion is the cause of the unusual photospheric composition observed in a number of these degenerates then the companions are of very low mass, beyond the detection thresholds of this study. Furthermore, these results do not contradict a previously noted deficit of very-low-mass stellar and brown dwarf companions to main sequence F,G,K and early-M type primaries (a<1000AU).
We hear it all too often in the media: an organization is attacked, its data, often containing personally identifying information, is made public, and a hacking group emerges to claim credit. In this excerpt, we discuss how such groups operate and describe the details of a few major cyber-attacks of this sort in the wider context of how they occurred. We feel that understanding how such groups have operated in the past will give organizations ideas of how to defend against them in the future.
The rapid adoption of generative AI-powered search engines, such as ChatGPT, Perplexity, and Gemini, is fundamentally reshaping information retrieval. We are witnessing a critical shift from traditional ranked lists to synthesized, citation-backed answers. This paradigm shift challenges established Search Engine Optimization (SEO) practices and necessitates a new framework, termed Generative Engine Optimization (GEO). In highly regulated environments like the UK iGaming sector, visibility is no longer dictated by keyword density, but by an entity's ability to project "Algorithmic Trust". This report presents an empirical analysis of how compliance signals -- such as UK Gambling Commission (UKGC) standards -- function as authority multipliers for Large Language Models (LLMs) when properly structured. Recent large-scale experiments reveal that AI Search exhibits a systematic and overwhelming bias towards Earned media (third-party, authoritative sources) over Brand-owned content. Consequently, practitioners must engineer their content for machine scannability and justification to dominate these new AI-perceived authority metrics.
Operator learning provides methods to approximate mappings between infinite-dimensional function spaces. Deep operator networks (DeepONets) are a notable architecture in this field. Recently, an extension of DeepONet based on model reduction and neural networks, proper orthogonal decomposition (POD)-DeepONet, has been able to outperform other architectures in terms of accuracy for several benchmark tests. We extend this idea towards nonlinear model order reduction by proposing an efficient framework that combines neural networks with kernel principal component analysis (KPCA) for operator learning. Our results demonstrate the superior performance of KPCA-DeepONet over POD-DeepONet.
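A rough sketch of the nonlinear-reduction idea, assuming the workflow is: fit kernel PCA on training output snapshots, regress the KPCA coefficients from input parameters with a small network, and reconstruct outputs via the KPCA pre-image. The synthetic data, RBF kernel, and MLP regressor are illustrative assumptions, not the paper's exact configuration.

```python
# Kernel-PCA-based operator-learning sketch: map input parameters to KPCA coefficients
# of the output field, then reconstruct via the KPCA pre-image (inverse transform).
# The synthetic dataset and model choices are illustrative only.
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_grid = np.linspace(0, 1, 64)
params = rng.uniform(0.5, 2.0, size=(200, 1))          # input parameters
outputs = np.sin(np.pi * params * x_grid)              # output functions sampled on the grid

kpca = KernelPCA(n_components=8, kernel="rbf", fit_inverse_transform=True, alpha=1e-3)
coeffs = kpca.fit_transform(outputs)                   # nonlinear "mode" coefficients

net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
net.fit(params, coeffs)                                # parameter -> coefficient map

test_param = np.array([[1.3]])
pred = kpca.inverse_transform(net.predict(test_param)) # reconstruct the output field
true = np.sin(np.pi * test_param * x_grid)
print("relative L2 error:", np.linalg.norm(pred - true) / np.linalg.norm(true))
```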
The persistence barcode (equivalently, the persistence diagram), which can be obtained from the interval decomposition of a persistence module, plays a pivotal role in applications of persistent homology. For multi-parameter persistent homology, which lacks a complete discrete invariant, and where persistence modules are no longer always interval decomposable, many alternative invariants have been proposed. Many of these invariants are akin to persistence barcodes, in that they assign (possibly signed) multisets of intervals. Furthermore, to any interval decomposable module, those invariants assign the multiset of intervals that correspond to its summands. Naturally, identifying the relationships among invariants of this type, or ordering them by their discriminating power, is a fundamental question. To address this, we formalize the notion of barcoding invariants and compare their discriminating powers. Notably, this formalization enables us to prove that all barcoding invariants with the same basis possess equivalent discriminating power. One implication of our result is that introducing a new barcoding invariant does not add any value in terms of its generic discriminating power