20 results found
State-of-the-art ASRs show suboptimal performance for child speech. The scarcity of child speech limits the development of child speech recognition (CSR). Therefore, we studied child-to-child voice conversion (VC) from existing child speakers in the dataset and additional (new) child speakers via monolingual and cross-lingual (Dutch-to-German) VC, respectively. The results showed that cross-lingual child-to-child VC significantly improved child ASR performance. Experiments on the impact of the quantity of child-to-child cross-lingual VC-generated data on fine-tuning (FT) ASR models gave the best results with two-fold augmentation for our FT-Conformer and FT-Whisper models, which reduced WER by ~3% absolute compared to the baseline, and with six-fold augmentation for the model trained from scratch, which improved by an absolute 3.6% WER. Moreover, using a small amount of "high-quality" VC-generated data achieved similar results to those of our best-FT models.
Observation is an essential tool for understanding and studying human behavior and mental states. However, coding human behavior is a time-consuming, expensive task, in which reliability can be difficult to achieve and bias is a risk. Machine learning (ML) methods offer ways to improve reliability, decrease cost, and scale up behavioral coding for application in clinical and research settings. Here, we use computer vision to derive behavioral codes or concepts of a gold standard behavioral rating system, offering familiar interpretation for mental health professionals. Features were extracted from videos of clinical diagnostic interviews of children and adolescents with and without obsessive-compulsive disorder. Our computationally-derived ratings were comparable to human expert ratings for negative emotions, activity-level/arousal and anxiety. For the attention and positive affect concepts, our ML ratings performed reasonably. However, results for gaze and vocalization indicate a need for improved data quality or additional data modalities.
Language models (LMs) have demonstrated remarkable proficiency in generating linguistically coherent text, sparking discussions about their relevance to understanding human language learnability. However, a significant gap exists between the training data for these models and the linguistic input a child receives. LMs are typically trained on data that is orders of magnitude larger and fundamentally different from child-directed speech (Warstadt and Bowman, 2022; Warstadt et al., 2023; Frank, 2023a). Addressing this discrepancy, our research focuses on training LMs on subsets of a single child's linguistic input. Previously, Wang, Vong, Kim, and Lake (2023) found that LMs trained in this setting can form syntactic and semantic word clusters and develop sensitivity to certain linguistic phenomena, but they only considered LSTMs and simpler neural networks trained from just one single-child dataset. Here, to examine the robustness of learnability from single-child input, we systematically train six different model architectures on five datasets (3 single-child and 2 baselines). We find that the models trained on single-child datasets showed consistent results that matched with previo
Despite advancements in ASR, child speech recognition remains challenging due to acoustic variability and limited annotated data. While fine-tuning adult ASR models on child speech is common, comparisons with flat-start training remain underexplored. We compare flat-start training across multiple datasets, SSL representations (WavLM, XEUS), and decoder architectures. Our results show that SSL representations are biased toward adult speech, with flat-start training on child speech mitigating these biases. We also analyze model scaling, finding consistent improvements up to 1B parameters, beyond which performance plateaus. Additionally, age-related ASR and speaker verification analysis highlights the limitations of proprietary models like Whisper, emphasizing the need for open-data models for reliable child speech research. All investigations are conducted using ESPnet, and our publicly available benchmark provides insights into training strategies for robust child speech processing.
We explore ideas and inclusive practices for designing and testing child-centered artificially intelligent technologies for neurodivergent children. AI is promising for supporting social communication, self-regulation, and sensory processing challenges common for neurodivergent children. The authors, both neurodivergent individuals and related to neurodivergent people, draw from their professional and personal experiences to offer insights on creating AI technologies that are accessible and include input from neurodivergent children. We offer ideas for designing AI technologies for neurodivergent children and considerations for including them in the design process while accounting for their sensory sensitivities. We conclude by emphasizing the importance of adaptable and supportive AI technologies and design processes and call for further conversation to refine child-centered AI design and testing methods.
The European Research and Development for Space based High Contrast Imaging II Workshop, held at MPIA in May 2025, advanced Europe's strategic coordination in support of future exoplanet imaging missions such as the Habitable Worlds Observatory and the Large Interferometer for Exoplanets mission. Building on the first 2024 workshop, this meeting defined concrete priorities across eight technical areas, including wavefront sensing, coronagraphs, post-processing, nulling interferometry, deformable mirrors, detectors, and telescope design. Discussions emphasized Europe's strengths in adaptive optics, ground-based facilities, and interferometry, while identifying key gaps, particularly the need for a dedicated European vacuum testbed for high-contrast imaging. The community highlighted near-infrared or UV coronagraphy as a promising domain for European leadership and called for joint development of advanced data reduction algorithms, detectors, and cross-mission coordination with HWO and LIFE. The workshop outcomes establish a collaborative roadmap to strengthen Europe's technological readiness, foster agency partnerships, and ensure its continued leadership in the next generation of space-b
Joint reading is a key activity for early learners, with caregiver-child interactions such as questioning and feedback playing an essential role in children's cognitive and linguistic development. However, for some parents, actively engaging children in storytelling can be challenging. To address this, we introduce TaleMate, a platform designed to enhance shared reading by leveraging conversational agents that have been shown to support children's engagement and learning. TaleMate enables a dynamic, participatory reading experience where parents and children can choose which characters they wish to embody. Moreover, the system navigates the challenges posed by digital reading tools, such as decreased parent-child interaction, and builds upon the benefits of traditional and digital reading techniques. TaleMate offers an innovative approach to fostering early reading habits, bridging the gap between traditional joint reading practices and the digital reading landscape.
Energy system optimization models are indispensable for planning the European energy transition. Yet their applicability is constrained by the fundamental trade-off between spatial detail and computational tractability. Modelers often tackle this by spatially aggregating electricity networks. Existing methods, however, neglect differences in voltage levels, reducing them to a single level and thereby overlooking the critical role of transformers in expansion planning. Therefore, we propose a novel voltage-aware network partitioning and aggregation methodology that preserves individual voltage levels and transformers. We demonstrate the effectiveness of this approach and compare it against a voltage-unaware grid aggregation by solving a network expansion problem for a European case study using PyPSA. Our findings show that the proposed methodology preserves up to 70% of the transformer expansion costs in the aggregated model compared to the full grid model, thereby significantly improving the accuracy of investment decisions for transformers in the aggregated grid.
Reliable transcription of child-adult conversations in clinical settings is crucial for diagnosing developmental disorders such as autism. Recent advances in deep learning and the availability of large-scale transcribed data have led to the development of speech foundation models that have shown dramatic improvements in ASR performance. However, their performance on conversational child-adult interactions remains underexplored. In this work, we provide a comprehensive evaluation of ASR performance on a dataset containing child-adult interactions from autism diagnostic sessions, using Whisper, Wav2Vec2, HuBERT, and WavLM. We find that speech foundation models show a noticeable performance drop (15-20% absolute WER) for child speech compared to adult speech in the conversational setting. Then, we fine-tune the best-performing zero-shot model (Whisper-large) using LoRA in a low-resource setting, yielding 8% and 13% absolute WER improvements for child and adult speech, respectively.
With the growing interest in using AI and machine learning (ML) in medicine, there is a growing body of literature covering the application and ethics of using AI and ML in areas of medicine such as clinical psychiatry. However, there is little literature covering the economic aspects associated with using ML in clinical psychiatry. This study addresses this gap by specifically studying the economic implications of using ML in clinical psychiatry. In this paper, we evaluate these implications using three problem-oriented case studies, literature on economics, socioeconomic and medical AI, and two types of health economic evaluations. In addition, we provide details on fairness, legal, ethical and other considerations for ML in clinical psychiatry.
The Proton Improvement Plan (PIP-II) to the FNAL accelerator chain and the Long-Baseline Neutrino Facility (LBNF) will provide the world's most intense neutrino beam to the Deep Underground Neutrino Experiment (DUNE) enabling a wide-ranging physics program. This document outlines the significant contributions made by European national laboratories and institutes towards realizing the first phase of the project with a 1.2 MW neutrino beam. Construction of this first phase is well underway. For DUNE Phase II, this will be closely followed by an upgrade of the beam power to > 2 MW, for which the European groups again have a key role and which will require the continued support of the European community for machine aspects of neutrino physics. Beyond the neutrino beam aspects, LBNF is also responsible for providing unique infrastructure to install and operate the DUNE neutrino detectors at FNAL and at the Sanford Underground Research Facility (SURF). The cryostats for the first two Liquid Argon Time Projection Chamber detector modules at SURF, a contribution of CERN to LBNF, are central to the success of the ongoing execution of DUNE Phase I. Likewise, successful and timely procurem
Researchers and policy-makers have started creating frameworks and guidelines for building machine-learning (ML) pipelines with a human-centered lens. ML pipelines comprise all the steps necessary to develop ML systems (e.g., developing a predictive keyboard). At the same time, a child-centered focus in developing ML systems has recently been gaining interest as children are becoming users of these products. These efforts predominantly focus on children's interaction with ML-based systems. However, from our experience, ML pipelines are yet to be adapted using a child-centered lens. In this paper, we list the questions we ask ourselves in adapting human-centered ML pipelines to child-centered ones. We also summarize two case studies of building end-to-end ML pipelines for children's products.
This document is submitted as input to the European Strategy for Particle Physics Update (ESPPU). The U.S.-based Electron-Ion Collider (EIC) aims at understanding how the complex dynamics of confined quarks and gluons makes up nucleons, nuclei and all visible matter, and determines their macroscopic properties. In April 2024, the EIC project received approval for critical-decision 3A (CD-3A) allowing for Long-Lead Procurement, bringing its realization another step closer. The ePIC Collaboration was established in July 2022 around the realization of a general purpose detector at the EIC. The EIC is based in the U.S.A. but is characterized as a genuine international project. In fact, a large group of European scientists is already involved in the EIC community: currently, about a quarter of the EIC User Group (consisting of over 1500 scientists) and 29% of the ePIC Collaboration (consisting of $\sim$1000 members) are based in Europe. This European involvement is not only an important driver of the EIC, but can also be beneficial to a number of related ongoing and planned particle physics experiments at CERN. In this document, the connections between the scientific questions addressed at C
The European Strategy for Particle Physics (ESPP) - 2026 update is taking place in a turbulent international climate. Many of the norms that have governed relations between states for decades are being broken or challenged. The future progress of science in general, and particle physics in particular, will depend on our ability to maintain peaceful international scientific collaboration in the face of political pressures. We plead that the ESPP 2026 update acknowledge explicitly the importance of peaceful international scientific collaboration, not only for the progress of science, but also as a precious bridge between geopolitical blocs. "Scientific thought is the common heritage of mankind" - Abdus Salam
Axions and other very weakly interacting slim (with $m <$ 1 GeV) particles (WISPs) are a common feature of several extensions of the Standard Model of Particle Physics. The search for WISPs was already recommended in the last update of the European strategy on particle physics (ESPP). Since then, the physics case for WISPs has gained additional momentum. Indeed, WISPs may provide a new paradigm to explain the nature of dark matter and puzzling astrophysical and particle physics observations. This document briefly summarizes current searches for WISPs and the perspectives in this research field for the next decade, ranging from their theoretical underpinning, over their indirect observational consequences in astrophysics, to their search in laboratory experiments. It is stressed that in Europe a rich, diverse, and low-cost experimental program is already underway with the potential for one or more game-changing discoveries. In this context, the document also reports the role of the EU-funded COST Action ''Cosmic WISPers in the Dark Universe: Theory, astrophysics, and experiments'' (CA21106, https://www.cost.eu/actions/CA21106) in coordinating and supporting WISP searches in Europe, sha
Heavy-flavour physics is an essential component of the particle-physics programme, offering critical tests of the Standard Model and far-reaching sensitivity to physics beyond it. Experiments such as LHCb, Belle II, and BESIII drive progress in the field, along with contributions from ATLAS and CMS. The LHCb Upgrade II and upgraded Belle II experiments will provide unique and highly sensitive measurements for decades, playing a key role in the searches for new physics. Future facilities with significant heavy-flavour capabilities will further expand these opportunities. We advocate for a European Strategy that fully supports Upgrade II of LHCb and an upgrade of Belle II, along with their subsequent exploitation. Additionally, we support a long-term plan that fully integrates flavour physics in an $e^+e^-$ collider to run as a $Z$ factory.
The Brazilian High-Energy Physics (HEP) community has expanded remarkably since its first involvement at CERN and Fermilab in the 1980s. Its recent organization under the Brazilian Network for High-Energy Physics (RENAFAE), since 2008, has further strengthened its scientific and technological goals, particularly in detector instrumentation, computing, and industry partnerships. In 2024, Brazil became an Associate Member State of CERN, opening new opportunities for deeper engagement in accelerator and detector R&D. This input to the 2026 update of the European Strategy for Particle Physics highlights Brazil's current participation in LHC experiments as well as ongoing developments in detector and accelerator technology, and details the community's view towards future colliders. The potential for expanded scientific and industrial collaborations between Brazil and CERN is also discussed.
In view of the European Strategy for Particle Physics process, the French HEP community has organized a national process of collecting written contributions and has pursued a series of workshops culminating in a national symposium held in Paris on January 20-21, 2025, that involved over 280 scientists (https://indico.in2p3.fr/event/34662/). The present document summarises the main conclusions of this bottom-up approach centred on the physics and technology motivations.
This document represents a contribution of the United States early career collider physics community to the 2025--2026 update to the European Strategy for Particle Physics. Preferences with regard to different future collider options and R&D priorities were assessed via a survey. The early career community was defined as anyone who is a graduate student, postdoctoral researcher, untenured faculty member, or research scientist under 40 years of age. In total, 105 participants responded to the survey between February and March 10th, 2025. Questions were formulated primarily to gauge the enthusiasm and preferences for different collider options in line with the recommendations of the United States' P5 report, relevant to the European Strategy Update.
Using YouTube Kids as an example, in this work, we argue the need to understand a child's interaction process with AI and its broader implications for a child's emotional, social, and creative development. We present several design recommendations to create value-driven interaction in child-centric AI that can guide the design of compelling, age-appropriate, beneficial AI experiences for children.