Search results: Vox sanguinis

20 results found

VoxServe: Streaming-Centric Serving System for Speech Language Models

arXiv, 2026-01-30. Authors: Keisuke Kamahori, Wei-Tzu Lee, Atindra Jha

Deploying modern Speech Language Models (SpeechLMs) in streaming settings requires systems that provide low latency, high throughput, and strong guarantees of streamability. Existing systems fall short of supporting diverse models flexibly and efficiently. We present VoxServe, a unified serving system for SpeechLMs that optimizes streaming performance. VoxServe introduces a model-execution abstraction that decouples model architecture from system-level optimizations, thereby enabling support for diverse SpeechLM architectures within a single framework. Building on this abstraction, VoxServe implements streaming-aware scheduling and an asynchronous inference pipeline to improve end-to-end efficiency. Evaluations across multiple modern SpeechLMs show that VoxServe achieves 10-20x higher throughput than existing implementations at comparable latency while maintaining high streaming viability. The code of VoxServe is available at https://github.com/vox-serve/vox-serve.

Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits

arXiv, 2025-05-20. Authors: Tiantian Feng, Jihwan Lee, Anfeng Xu

We introduce Vox-Profile, a comprehensive benchmark to characterize rich speaker and speech traits using speech foundation models. Unlike existing works that focus on a single dimension of speaker traits, Vox-Profile provides holistic and multi-dimensional profiles that reflect both static speaker traits (e.g., age, sex, accent) and dynamic speech properties (e.g., emotion, speech flow). This benchmark is grounded in speech science and linguistics, developed with domain experts to accurately index speaker and speech characteristics. We report benchmark experiments using over 15 publicly available speech datasets and several widely used speech foundation models that target various static and dynamic speaker and speech properties. In addition to benchmark experiments, we showcase several downstream applications supported by Vox-Profile. First, we show that Vox-Profile can augment existing speech recognition datasets to analyze ASR performance variability. Vox-Profile is also used as a tool to evaluate the performance of speech generation systems. Finally, we assess the quality of our automated profiles through comparison with human evaluation and show convergent validity. Vox-Profile

VoxServe: Streaming-Centric Serving System for Speech Language Models

Vox-Profile: A Speech Foundation Model Benchmark for Characterizing Diverse Speaker and Speech Traits

Vox-Surf: Voxel-based Implicit Surface Representation

Vox-Evaluator: Enhancing Stability and Fidelity for Zero-shot TTS with A Multi-Level Evaluator

Vox Deorum: A Hybrid LLM Architecture for 4X / Grand Strategy Game AI -- Lessons from Civilization V

Giant spin shift current in two-dimensional altermagnetic multiferroics VOX$\mathrm{_2}$

Recent Developments of the VOXES Von Hamos X-ray Spectrometer for Laboratory XES and XAS Studies

Control of ferromagnetism of Vanadium Oxide thin films by oxidation states

Vox-Fusion++: Voxel-based Neural Implicit Dense Tracking and Mapping with Multi-maps

VOX-KRIKRI: Unifying Speech and Language through Continuous Fusion

Vox-Fusion: Dense Tracking and Mapping with Voxel-based Neural Implicit Representation

FR-LIO: Fast and Robust Lidar-Inertial Odometry by Tightly-Coupled Iterated Kalman Smoother and Robocentric Voxels

Prediction of Magnetoelectric Multiferroic Janus Monolayers VOXY (X/Y = F, Cl, Br, or I, and X$\neq$Y) with in-plane ferroelectricity and out-of-plane piezoelectricity

Vox-UDA: Voxel-wise Unsupervised Domain Adaptation for Cryo-Electron Subtomogram Segmentation with Denoised Pseudo Labeling

BiMind: A Dual-Head Reasoning Model with Attention-Geometry Adapter for Incorrect Information Detection

Detecting Iron Oxidation States in Liquids with the VOXES Bragg Spectrometer

Low Temperature Formation of Crystalline VO2 Domains in Porous Nanocolumnar Thin Films for Thermochromic Applications

Vox Populi, Vox AI? Using Language Models to Estimate German Public Opinion

Vox Populi, Vox ChatGPT: Large Language Models, Education and Democracy

Data Quality Issues in Multilingual Speech Datasets: The Need for Sociolinguistic Awareness and Proactive Language Planning