Search - ResearchTracker

Large Language Models (LLMs) are versatile and demonstrate impressive generalization ability by mining and learning information from extensive unlabeled text. However, they still exhibit reasoning mistakes, often stemming from knowledge deficiencies, which can affect their trustworthiness and reliability. Although users can provide diverse and comprehensive queries, obtaining sufficient and effective feedback is demanding. Furthermore, evaluating LLMs comprehensively with limited labeled samples is difficult. This makes it a challenge to diagnose and remedy the deficiencies of LLMs through rich label-free user queries. To tackle this challenge, we propose a label-free curricular meaningful learning framework (LaMer). LaMer first employs relative entropy to automatically diagnose and quantify the knowledge deficiencies of LLMs in a label-free setting. Next, to remedy the diagnosed knowledge deficiencies, we apply curricular meaningful learning: first, we adopt meaningful learning to adaptively synthesize augmentation data according to the severity of the deficiencies, and then design a curricular deficiency remedy strategy to remedy the knowledge deficiencies of LLMs progressively.
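The abstract above names relative entropy (Kullback-Leibler divergence) as the diagnostic quantity but does not say which distributions are compared. The following is only a minimal sketch of the quantity itself, not the paper's procedure. The function name `relative_entropy`, the `eps` floor, and the example distributions are all hypothetical and chosen for illustration.

```python
import math


def relative_entropy(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same support.
    eps is an assumed floor that avoids division by zero when q has zeros.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i == 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical probability distributions over three candidate answers.
confident = [0.90, 0.05, 0.05]
uncertain = [0.40, 0.35, 0.25]
uniform = [1 / 3, 1 / 3, 1 / 3]

kl_confident = relative_entropy(confident, uniform)
kl_uncertain = relative_entropy(uncertain, uniform)
```

In this toy setup the peaked distribution lies further from the uniform reference than the flat one, so `kl_confident` is larger than `kl_uncertain`. How such a score maps onto a knowledge deficiency is specified in the paper, not here.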

From Course to Skill: Evaluating LLM Performance in Curricular Analytics

arXiv, 2025-05-05. Authors: Zhen Xu, Xinjin Li, Yingqi Huan

Curricular analytics (CA) -- systematic analysis of curriculum data to inform program and course refinement -- is becoming an increasingly valuable tool to help institutions align academic offerings with evolving societal and economic demands. Large language models (LLMs) are promising for handling large-scale, unstructured curriculum data, but it remains uncertain how reliably LLMs can perform CA tasks. In this paper, we systematically evaluate four text alignment strategies based on LLMs or traditional NLP methods for skill extraction, a core task in CA. Using a stratified sample of 400 curriculum documents of different types and a human-LLM collaborative evaluation framework, we find that retrieval-augmented generation (RAG) is the top-performing strategy across all types of curriculum documents, while zero-shot prompting performs worse than traditional NLP methods in most cases. Our findings highlight the promise of LLMs in analyzing brief and abstract curriculum documents, but also reveal that their performance can vary significantly depending on model selection and prompting strategies. This underscores the importance of carefully evaluating the performance of LLM-based strategies

Search results: curricular

Diagnosing and Remedying Knowledge Deficiencies in LLMs via Label-free Curricular Meaningful Learning

From Course to Skill: Evaluating LLM Performance in Curricular Analytics

A Systematic Review on Process Mining for Curricular Analysis

CASE: Efficient Curricular Data Pre-training for Building Assistive Psychology Expert Models

SCANet: Self-Paced Semi-Curricular Attention Network for Non-Homogeneous Image Dehazing

Self-Supervised Curricular Deep Learning for Chest X-Ray Image Classification

Curricular and Cyclical Loss for Time Series Learning Strategy

From Easy to Hard: Learning Curricular Shape-aware Features for Robust Panoptic Scene Graph Generation

Learning Agility and Adaptive Legged Locomotion via Curricular Hindsight Reinforcement Learning

Dual-view Curricular Optimal Transport for Cross-lingual Cross-modal Retrieval

How are Primary School Computer Science Curricular Reforms Contributing to Equity? Impact on Student Learning, Perception of the Discipline, and Gender Gaps

Curricular Contrastive Regularization for Physics-aware Single Image Dehazing

Curricular Object Manipulation in LiDAR-based Object Detection

Evolution With Purpose: Hierarchy-Informed Optimization of Whole-Brain Models

Evaluating 21st-Century Competencies in Postsecondary Curricula with Large Language Models: Performance Benchmarking and Reasoning-Based Prompting Strategies

Redact or Keep? A Fully Local AI Cascade for Educational Dialogue De-Identification

Resource Letter QIE-1: Research in quantum information education

The impact of generative artificial intelligence on academic development of Chinese students in humanities and social sciences

Barriers to Integrating Low-Power IoT in Engineering Education: A Survey of the Literature

Homeostasis Under Technological Transition: How High-Friction Universities Adapt Through Early Filtering Rather Than Reconfiguration