Search - ResearchTracker

Tasks in the aerospace industry heavily rely on searching and reusing large volumes of technical documents, yet there is no public information retrieval (IR) benchmark that reflects the terminology- and query-intent characteristics of this domain. To address this gap, this paper proposes the STELLA (Self-Reflective TErminoLogy-Aware Framework for BuiLding an Aerospace Information Retrieval Benchmark) framework. Using this framework, we introduce the STELLA benchmark, an aerospace-specific IR evaluation set constructed from NASA Technical Reports Server (NTRS) documents via a systematic pipeline that comprises document layout detection, passage chunking, terminology dictionary construction, synthetic query generation, and cross-lingual extension. The framework generates two types of queries: the Terminology Concordant Query (TCQ), which includes the terminology verbatim to evaluate lexical matching, and the Terminology Agnostic Query (TAQ), which utilizes the terminology's description to assess semantic matching. This enables a disentangled evaluation of the lexical and semantic matching capabilities of embedding models. In addition, we combine Chain-of-Density (CoD) and the Self-Re
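The TCQ/TAQ distinction described above can be illustrated with a minimal sketch. It assumes simple string templates in place of the LLM-based synthetic query generation the abstract mentions; the `Term` class, the two helper functions, the template wording, and the example term are all hypothetical and are not taken from the STELLA paper.

```python
from dataclasses import dataclass


@dataclass
class Term:
    """A dictionary entry: a domain term and its plain-language description."""
    name: str
    description: str


# Hypothetical template; the paper generates queries with an LLM instead.
TEMPLATE = "What does the report say about {subject}?"


def make_tcq(term: Term) -> str:
    # Terminology Concordant Query: the term appears verbatim,
    # so a lexical matcher can succeed on surface overlap alone.
    return TEMPLATE.format(subject=term.name)


def make_taq(term: Term) -> str:
    # Terminology Agnostic Query: the term is replaced by its description,
    # so retrieval has to rely on semantic matching.
    return TEMPLATE.format(subject=term.description)


# Illustrative dictionary entry (an assumption, not from the benchmark).
EXAMPLE = Term(
    name="regolith",
    description="the loose layer of dust and broken rock covering a planetary surface",
)
```

Calling `make_tcq(EXAMPLE)` and `make_taq(EXAMPLE)` yields two queries about the same passage, one containing the term and one not, which is the property that lets lexical and semantic matching be scored separately.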

TermGPT: Multi-Level Contrastive Fine-Tuning for Terminology Adaptation in Legal and Financial Domain

arXiv, 2025-11-13. Authors: Yidan Sun, Mengying Zhu, Feiyue Chen

Large language models (LLMs) have demonstrated impressive performance in text generation tasks; however, their embedding spaces often suffer from the isotropy problem, resulting in poor discrimination of domain-specific terminology, particularly in legal and financial contexts. This weakness in terminology-level representation can severely hinder downstream tasks such as legal judgment prediction or financial risk analysis, where subtle semantic distinctions are critical. To address this problem, we propose TermGPT, a multi-level contrastive fine-tuning framework designed for terminology adaptation. We first construct a sentence graph to capture semantic and structural relations, and generate semantically consistent yet discriminative positive and negative samples based on contextual and topological cues. We then devise a multi-level contrastive learning approach at both the sentence and token levels, enhancing global contextual understanding and fine-grained terminology discrimination. To support robust evaluation, we construct the first financial terminology dataset derived from official regulatory documents. Experiments show that TermGPT outperforms existing baselines in term di
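As a rough illustration of contrastive learning at two levels, the sketch below combines a sentence-level and a token-level loss. It assumes the standard InfoNCE objective with cosine similarity, which the abstract does not confirm as TermGPT's actual loss; the function names, the `temperature` default, and the `token_weight` parameter are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(anchor: Vector, positive: Vector,
             negatives: List[Vector], temperature: float = 0.1) -> float:
    """InfoNCE loss: -log softmax of the positive among positive + negatives."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_denominator - logits[0]


Triplet = Tuple[Vector, Vector, List[Vector]]


def multi_level_loss(sentence: Triplet, token: Triplet,
                     token_weight: float = 0.5) -> float:
    # Sentence-level term targets global context; token-level term targets
    # fine-grained terminology. The weighting scheme is an assumption.
    return info_nce(*sentence) + token_weight * info_nce(*token)
```

The loss is small when the anchor is closer to its positive than to any negative, and grows when a negative is the nearest neighbor, which is the behavior a terminology-discrimination objective needs at both levels.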

Search results: Terminology

STELLA: Self-Reflective Terminology-Aware Framework for Building an Aerospace Information Retrieval Benchmark

TermGPT: Multi-Level Contrastive Fine-Tuning for Terminology Adaptation in Legal and Financial Domain

Terminology-Aware Translation with Constrained Decoding and Large Language Model Prompting

Learning to Translate Ambiguous Terminology by Preference Optimization on Post-Edits

Building from Scratch: A Multi-Agent Framework with Human-in-the-Loop for Multilingual Legal Terminology Mapping

It Takes Two: A Dual Stage Approach for Terminology-Aware Translation

Attention2Probability: Attention-Driven Terminology Probability Estimation for Robust Speech-to-Text System

TAT-R1: Terminology-Aware Translation with Reinforcement Learning and Word Alignment

LLM-BT-Terms: Back-Translation as a Framework for Terminology Standardization and Dynamic Semantic Embedding

Locate-and-Focus: Enhancing Terminology Translation in Speech Language Models

Defining a Role-Centered Terminology for Physical Representations and Controls

Efficient Terminology Integration for LLM-based Translation in Specialized Domains

Towards Global AI Inclusivity: A Large-Scale Multilingual Terminology Dataset (GIST)

Toward Human-Centered AI-Assisted Terminology Work

MedCT: A Clinical Terminology Graph for Generative AI Applications in Healthcare

Cascaded Beam Search: Plug-and-Play Terminology-Forcing For Neural Machine Translation

The "negative end" of change in grammar: terminology, concepts and causes

Sound Terminology Describing Production and Perception of Sonification

KpopMT: Translation Dataset with Terminology for Kpop Fandom

A Terminology for Scientific Workflow Systems