Search - ResearchTracker

Detecting Spoof Voices in Asian Non-Native Speech: An Indonesian and Thai Case Study

This study focuses on building effective spoofing countermeasures (CMs) for non-native speech, specifically targeting Indonesian and Thai speakers. We constructed a dataset comprising both native and non-native speech to facilitate our research. Three key features (MFCC, LFCC, and CQCC) were extracted from the speech data, and three classic machine learning-based classifiers (CatBoost, XGBoost, and GMM) were employed to develop robust spoofing detection systems using the native and combined (native and non-native) speech data. This resulted in two types of CMs: Native and Combined. The performance of these CMs was evaluated on both native and non-native speech datasets. Our findings reveal significant challenges faced by the Native CM in handling non-native speech, highlighting the necessity for domain-specific solutions. The proposed method shows improved detection capabilities, demonstrating the importance of incorporating non-native speech data into the training process. This work lays the foundation for more effective spoofing detection systems in diverse linguistic contexts.

The Impact of Editorial Intervention on Detecting Native Language Traces

arXiv, 2026-05-11. Authors: Ahmet Yavuz Uluslu, Mark Gales, Kate Knill

Native Language Identification (NLI) is the task of determining an author's native language (L1) from their non-native writings. With the advent of human-AI co-authorship, non-native texts are routinely corrected and rewritten by large language models, fundamentally altering the linguistic features NLI models depend on. In this paper, we investigate the robustness of L1 traces across increasing degrees of editorial intervention. By processing 450 essays from the Write & Improve 2024 corpus through varying levels of grammatical error correction (GEC) and paraphrasing, we demonstrate that L1 attribution does not entirely depend on surface-level errors. Instead, the detection models leverage deeper L1 features: unidiomatic lexico-semantic choices, pragmatic transfer, and the author's underlying cultural perspective. We find that minimal edits preserve these structural traces and maintain high profiling accuracy. In contrast, fluency edits and paraphrasing normalize these L1 features, leading to a severe degradation in performance.

Search results: native

Detecting Spoof Voices in Asian Non-Native Speech: An Indonesian and Thai Case Study

The Impact of Editorial Intervention on Detecting Native Language Traces

Reducibility of native weighted graphs on Rydberg Arrays

Toward Native Multimodal Modeling: A Roadmap

Native Design Bias: Studying the Impact of English Nativeness on Language Model Performance

Modality-Native Routing in Agent-to-Agent Networks: A Multimodal A2A Protocol Extension

Native Segmentation Vision Transformers

Tensor Manifold-Based Graph-Vector Fusion for AI-Native Academic Literature Retrieval

Non-native English lexicon creation for bilingual speech synthesis

APEX: A Network-Native Time-Series Foundation Model for Forecasting and Anomaly Detection for Wireless Edge Operations

Using Sentiment Analysis to Investigate Peer Feedback by Native and Non-Native English Speakers

Characterizing Model-Native Skills

Native Reasoning Models: Training Language Models to Reason on Unverifiable Data

Lazy Quantum Walks with Native Multiqubit Gates

ReuNify: A Step Towards Whole Program Analysis for React Native Android Apps

VectraYX-Nano: A 42M-Parameter Spanish Cybersecurity Language Model with Curriculum Learning and Native Tool Use

Cloud Native System for LLM Inference Serving

Native vs Non-Native Language Prompting: A Comparative Analysis

From Automated to Autonomous: Hierarchical Agent-native Network Architecture (HANA)

Detecting Generated Native Ads in Conversational Search