Search - ResearchTracker

LLM-as-a-Judge has become the dominant evaluation paradigm for many natural language generation tasks, due to shortcomings of conventional metrics and high correlations with human judgment, albeit mostly in English. There are now attempts to extend LLM-as-a-Judge to multilingual settings, including low-resource languages. However, LLMs have limited proficiency in low-resource languages, and there is often no adequate human validation in these settings. To highlight the scope of the problem and current practices, we explore the use of LLM-as-a-Judge evaluators in ACL Anthology papers focusing on multilingual settings and low-resource languages across a diverse set of tasks. Out of 650 papers mentioning LLM-as-a-Judge, only 33 focus on low-resource or multilingual settings. Our in-depth analysis of these papers indicates inconsistent evaluation outcomes, a tendency to overtrust LLM judgments in multilingual settings, and widespread reliance on a single judge model per study. To help the NLP community further, we conclude with recommendations about how to use LLM-as-a-Judge in multilingual and low-resource settings.

Environmental (in)considerations in the Design of Smartphone Settings

arXiv · 2025-07-25 · Authors: Thomas Thibault, Léa Mosesso, Camille Adam

Designing for sufficiency is one of many approaches that could foster more moderate and sustainable digital practices. Based on the Sustainable Information and Communication Technologies (ICT) and Human-Computer Interaction (HCI) literature, we identify five environmental settings categories. However, our analysis of three mobile operating systems and nine representative applications shows an overall lack of environmental concerns in settings design, leading us to identify six pervasive anti-patterns. Environmental settings, where they exist, are set to the most intensive option by default. They are not presented as such, are not easily accessible, and offer little explanation of their impact. Instead, they encourage more intensive use. Based on these findings, we create a design workbook that explores design principles for environmental settings: presenting the environmental potential of settings; shifting to environmentally neutral states; previewing effects to encourage moderate use; rethinking defaults; facilitating settings access; and exploring more frugal settings. Building upon this workbook, we discuss how settings can tie individual behaviors to systemic factors.

Search results: Settings

Challenges and Recommendations for LLMs-as-a-Judge in Multilingual Settings and Low-Resource Languages

Environmental (in)considerations in the Design of Smartphone Settings

Model Restrictiveness in Functional and Structural Settings

Optimized Deferral for Imbalanced Settings

Exploring Child-Robot Interaction in Individual and Group settings in India

MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings

From Development to Deployment of AI-assisted Telehealth and Screening for Vision- and Hearing-threatening diseases in resource-constrained settings: Field Observations, Challenges and Way Forward

Puzzlegram: a Serious Game Designed for the Elderly in Group Settings

Advancing Enterprise Spatio-Temporal Forecasting Applications: Data Mining Meets Instruction Tuning of Language Models For Multi-modal Time Series Analysis in Low-Resource Settings

Skein Categories in Non-semisimple Settings

Automated Identification of Security-Relevant Configuration Settings Using NLP

Convergence of Empirical Optimal Transport in Unbounded Settings

Scaling Out-of-Distribution Detection for Real-World Settings

Learning to Advise Humans in High-Stakes Settings

Understanding User Awareness and Behaviors Concerning Encrypted DNS Settings

Bell inequalities with retarded settings

FOCUS: Familiar Objects in Common and Uncommon Settings

Partial Identification of Causal Effects that Vary by Setting

Benchmarking of algorithms for set partitions

Graph Set Transformer