Search results: what

20 results found

Knowledge Distillation Must Account for What It Loses

arXiv

This position paper argues that knowledge distillation must account for what it loses: student models should be judged not only by retained task scores, but by whether they preserve the teacher capabilities that make those scores reliable. This matters because distillation is increasingly used to turn large teacher models into deployable students, yet headline metrics can obscure losses in the capabilities that make teacher behavior reliable. Conceptually, we show that current evaluation often assumes retained task scores imply retained teacher capabilities. Reframing distillation as a lossy projection exposes this flaw: students may match selected teacher observables without preserving the capabilities that make them reliable. We then synthesize existing evidence into a taxonomy of off-metric distillation losses, showing that such losses are concrete, recurring, and measurable, yet often unaccounted for when studies report what students retain rather than what they lose. To make the position actionable, we propose scenario-specific preservation targets and a Distillation Loss Statement that reports what was preserved, what was lost, and why the remaining losses are acceptable. The

AI Exposure Scores: what they measure, what they miss, and what comes next

arXiv, 2026-06-22. Authors: Campbell Lund, Thomas Euyang, Zanele Munyikwa

A set of exposure scores calculated in 2023 has become a central empirical input to the future of work debate. Produced by Eloundou et al. (2023) and referred to here as the GPTs are GPTs scores, they define exposure as the share of occupational tasks a large language model can assist with. This work is a genuine methodological contribution, but as the scores travel from the time and place they were produced, the limitations the authors named do not always travel with them. Two gaps have widened as a result. The first is structural, between what static exposure scores measure and what policy questions actually require. Taking the diffusion of these scores as a case study, we show how their temporal, geographic, and ontological limitations compound in policy-facing analyses, and we survey five families of research responding to these limits: dynamic and benchmark-based measures, ensemble methods, task-framework extensions, worker-centered metrics, and adoption and usage data. The second gap is the one we argue needs more attention: the coordination between researchers and policymakers. The policy-relevant work that asks who is harmed, who benefits, how, and when continues to refere

PRAXA: A Grammar for What-If Analysis

What Matters When Cotraining Robot Manipulation Policies on Everyday Human Videos?

What is a word?

What is ontic and what is epistemic in the Quantum Mechanics of Spin?

What Makes a Reward Model a Good Teacher? An Optimization Perspective

A Survey on Responsible Generative AI: What to Generate and What Not

Wolf-Rayet stars -- what we know and what we don't

Proceedings to the 27th Workshop "What Comes Beyond the Standard Models" Bled, July 8-17, 2024

What is isotropic turbulence and why is it important?

What and why of entanglement

What is Entanglement?

Firefly swarms: What models for what physics?

What Is In a Survey? Simulation-Induced Selection Effects in Astronomy

Understanding Physics: 'What?', 'Why?', and 'How?'

What is "fundamental"?

Proceedings to the 25th International Workshop "What Comes Beyond the Standard Models", July 4 -- July 10, 2022, Bled, Slovenia

Ask what's missing and what's useful: Improving Clarification Question Generation using Global Knowledge

What is Quantum Computation?