Parent-child interaction (PCI) interventions have the potential to mitigate early-identified risks of Speech, Language and Communication Needs (SLCN). PCI interventions can be delivered at Universal, Targeted and Specialist levels, but evidence for effectiveness at the Universal level is lacking, especially for some populations. We examine the acceptability of a universal PCI intervention for two underserved groups: children who have SLCN and/or are multilingual. For the former group we also explore acceptability of a supplementary, targeted intervention. This study aimed to: (a) evaluate the acceptability of a digital early years PCI support service, comprising a universal text-message service delivering BBC Tiny Happy People videos and targeted speech and language therapy following the Early Language Identification Measure & Intervention (ELIM-I); (b) establish the interest of families with children who have, or are at risk of, SLCN (N = 61) and/or are multilingual (N = 26) in utilising the service, explore their perceptions of its merits and drawbacks, and elicit recommendations for improvements. We employed a mixed-methods approach. Quantitative data were collected via questionnaires based on the Theoretical Framework of Acceptability. Qualitative data were gathered through semi-structured online interviews. Families of children with SLCN provided prospective acceptability data after reviewing three videos, indicating their view of receiving similar weekly video content via text message. Those who then opted to try the text service for a month provided retrospective acceptability data, with additional questions for participants who received the targeted online ELIM-I intervention. Multilingual families received the service for three months before providing retrospective acceptability data. Quantitative analyses revealed that all acceptability ratings were high on average, though there was individual variability. 
Reflexive thematic analysis of caregivers' qualitative data identified three central themes: (a) demand for trustworthy guidance to address uncertainty; (b) positives including service suitability for busy family life, personalisation, human connection and reassurance, enjoyment and perceived efficacy; (c) a need for inclusive content, especially for children with complex SLCN. There is a clear desire for early digital services to help caregivers support their children's language development. Acceptability was generally high. Caregivers wanted to see their family represented in video content. This was largely successful for the multilingual group with content celebrating home languages. Caregivers of children with SLCN sometimes felt under-represented and recommended demonstrating support strategies appropriate for their child's age and stage of development. What is already known on this subject PCI interventions have the potential to mitigate the risk of SLCN from the early years. There has been interest in creating digital interventions that can reach families universally at low cost and BBC Tiny Happy People have developed video content along these lines. However, previous evaluations of digital services using BBC Tiny Happy People materials have excluded multilingual families and families who have children with/at risk of SLCN. What this paper adds to existing knowledge This study highlights that a digital support service comprising universal weekly text messages sharing BBC Tiny Happy People video content and targeted online ELIM-I intervention is acceptable and desirable to families whose children have SLCN and/or are multilingual. Caregivers highlighted necessary changes to cater for diversity, particularly the need for appropriate content for families whose children have SLCN. What are the potential or actual clinical implications of this study? 
Findings support the use of a public health framework to provide families with accessible and evidence-based support (in this case, a universal digital service to promote PCI in the early years with an additional, targeted ELIM-I intervention for those with SLCN). This could remove pressure from in-person services by preventing need, supporting triaging, enabling families to support their child whilst waiting for speech and language therapy and to be more ready for specialist intervention if this is needed. Selecting content that best matches families' characteristics increases acceptability. 1. Digital Services are Acceptable as an Initial Layer of Intervention Caregivers rated the digital language support service highly. They felt that it addressed their curiosity and concern about their child's development and that its benefits persisted regardless of whether caregivers were accessing other support. This indicates that a digital, universal service can function as an effective initial layer of intervention, providing support to individuals on waiting lists and contributing to the management of speech and language therapy caseloads. 2. Inclusive and Diverse Content is Essential to Avoid Caregiver Distress and Harm Content that is representative of family diversity is essential in universal digital language support services. Caregiver feedback highlighted the profound impact of representation: a multilingual family articulated feeling 'connected with the people' when presented with relevant content. Conversely, a perceived lack of content reflective of complex needs was associated with participant distress and reduced engagement or withdrawal from the service. These experiences highlight an ethical responsibility to develop inclusive and representative content that reflects a wide range of developmental trajectories, family structures and cultures. 3. 
The Speech and Language Therapist's Pivotal Role in Personalising Universal Digital Services The study demonstrated that the human element of the service, such as personalised messages and interaction by text message, phone or video call, was highly valued by caregivers. This human connection helped offset the limitations of a universal digital service, providing the opportunity to clarify misunderstandings and a sense of being listened to and understood. This finding indicates that a successful digital service is not about replacing human interaction; it is about using digital platforms to complement and extend the reach of the speech and language therapist's professional role.
Motor-speech skills slow down with age, but health care professionals lack normative data, especially on the vastly growing population of very old (VO) speakers. The execution of different motor-speech tasks requires both fine-motoric and cognitive abilities. To study the performance on oral diadochokinetic (DDK) rate and narrative speech tempo in typically ageing 80-100-year-old speakers and to investigate whether they are predicted by age, dentition, hearing, cognitive status, language skills or educational level. This cross-sectional study comprises 50 typically ageing VO Finnish speakers. Their motor-speech performance was evaluated by alternating motion rate (AMR) syllables /pa/, /ta/ and /ka/ and sequential motion rate (SMR) syllable sequence /pataka/ and two speech tempo parameters (speaking and articulation rate) in semi-spontaneous narrative. The association between task performance and background variables was studied by multiple linear regression analysis. The VO speakers' normative performance in DDK, speaking and articulation rates was predicted by physio-anatomical and cognitive-linguistic factors. Older age within the 80-100-year range was associated only with slower execution of the SMR task. Wearing dentures predicted slower tempo in the AMR tasks and articulation rate. The highest educational level predicted slower tempo in the AMR tasks. Good language skills were positively associated with motor-speech performance: Better phonemic fluency predicted faster AMR /pa/ and SMR /pataka/, and a higher Western Aphasia Battery Aphasia Quotient predicted a faster speaking rate. The VO speakers had relatively well-preserved motor-speech skills. Consistent with previous studies, the mean DDK, speaking and articulation rates were, nevertheless, slower in the VO speakers than in younger speakers in prior research. As a novel finding, SMR was slower than AMR in the VO speakers, which deviates from the trend observed in Finnish adults and younger elderly. 
This study suggests that natural teeth, younger age and good language skills safeguard the motor-speech skills from slowing down. However, it seems characteristic for the most highly educated VO speakers to perform slower than peers in the AMR tasks. The results of this study will help to identify the manifestations of typical ageing. They also give insight into the life-long evolution of speech skills and into the relationship between the motoric and linguistic facets of speech. What is already known on the subject Motor-speech performance slows down with ageing because of physio-anatomical and cognitive-linguistic changes. Speech tempo is, thus, a sensitive biomarker for neurological alterations. What this study adds to existing knowledge Formerly missing data on very old (VO) speakers' typical motor-speech skills and their predictors. Within-group differences are promoted by age, dentition, language skills and educational level. What are the clinical implications of this work? DDK, speaking and articulation rates are suitable means for clinical motor-speech assessment even in the oldest speakers. In addition to chronological age, speech-language pathologists should consider a range of individual physio-anatomical and cognitive-linguistic factors that contribute to speech performance. We recommend conducting AMR tasks /pa/, /ta/ and /ka/ together with an SMR /pataka/ task and comparing the results with culturally valid age-specific norms (if available), as their atypical relationship may indicate neuropathology.
Globally, there is still limited understanding of how Speech-Language Pathologists (SLPs) assess and treat multilingual people with aphasia (MPWA). This article presents results from the Multilingual Aphasia Practices (MAP) survey, an extensive international study involving 407 SLPs working across 60 countries. The MAP survey explored: 1) the multilingual background of SLPs and the languages they incorporate into service delivery, 2) their knowledge and professional training related to multilingualism and multilingual aphasia, and 3) their workplace contexts and client profiles. A large proportion of respondents (79.7%) identified as multilingual and reported using numerous languages in their practice. However, formal training in multilingualism was often minimal. Only 25.06% had completed a course focused on multilingualism, and just 10.07% took a full course specific to multilingual aphasia. Most participants (87.2%) reported major gaps in knowledge and training, particularly regarding best-practice recommendations, supervised clinical experience, and guidance on assessment and intervention for MPWA. Many expressed a strong desire for additional professional development in these areas. Clinical exposure to MPWA varied widely. While 27% of respondents reported daily contact, 26.1% encountered MPWA once or twice per week, and 22.8% indicated that they worked with MPWA only a few days per month. Overall, our findings point to persistent and widespread global gaps in training, resources, and clinical readiness for working with MPWA. The results underscore an urgent need to enhance multilingualism-focused education in SLP programs, establish international best-practice frameworks, and develop and disseminate culturally and linguistically appropriate assessment and treatment materials.
Language influences our thinking and affects many aspects of cognition, from how we perceive the world to how we interact socially. Thus, objectively characterizing linguistic background is crucial for research in many areas, including second language acquisition, psycholinguistics, and cognitive science. Traditional language proficiency tests, however, are manually composed by experts, limiting their scope for both lab and online settings. Here, we propose a pipeline that automatically derives a language proficiency test from a corpus of text and applies it to create new tests for 1,939 languages. Using this approach, we conducted a large-scale survey examining L1 and L2 proficiency across 34 countries, with participants tested on all 34 languages. Drawing from human ratings from 4,137 participants, our results validate that our test can effectively distinguish native speakers, second-language speakers, and nonspeakers within one minute, making it an effective tool for evaluating linguistic proficiency. We show that participants' linguistic and demographic backgrounds systematically influence both their language proficiency and their self-reported skills, and we map the prevalence of global languages, such as English and Spanish, among online participants. Moreover, we show that our vocabulary tests are strongly correlated with other linguistic competences, such as listening and writing, in a set of typologically varied languages, demonstrating our test is an efficient instrument to assess language proficiency. More broadly, our work offers a significant resource for investigating global variation in language skills and contributes to reducing the overreliance on the English language in the cognitive and social sciences.
Automated speech and language analysis (ASLA) is gaining momentum as a noninvasive, affordable, and scalable approach for the early detection of Alzheimer disease (AD). Nevertheless, the literature presents 2 notable limitations. First, many studies use computationally derived features that lack clinical interpretability. Second, a significant proportion of ASLA studies have been conducted exclusively in English speakers. These shortcomings reduce the utility and generalizability of existing findings. To address these gaps, we investigated whether interpretable linguistic features can reliably identify AD both within and across language boundaries, focusing on English- and Spanish-speaking patients and healthy controls (HCs). We analyzed speech recordings from 211 participants, encompassing 117 English speakers (58 patients with AD and 59 HCs) and 94 Spanish speakers (47 patients with AD and 47 HCs). Participants completed a validated picture description task from the Boston Diagnostic Aphasia Examination, eliciting natural speech under controlled conditions. Recordings were preprocessed and transcribed before extracting (1) speech timing features (eg, pause duration, speech segment ratios, and voice rate) and (2) lexico-semantic features (lexical category ratios, semantic granularity, and semantic variability). Machine learning classifiers were trained with data from English-speaking patients and HCs, and then tested (1) in a within-language setting (with English-speaking patients and HCs) and (2) in a between-language setting (with Spanish-speaking patients and HCs). Additionally, the features were used to predict cognitive functioning as measured by the Mini-Mental State Examination (MMSE). 
In the within-language condition, combined speech timing and lexico-semantic features yielded maximal classification (area under the receiver operating characteristic curve [AUC]=0.88), outperforming single-feature models (AUC=0.79 for timing features; AUC=0.80 for lexico-semantic features). Timing features showed the strongest MMSE prediction (R=0.43, P<.001). In the between-language condition, speech timing features generalized well to Spanish speakers (AUC=0.75) and predicted Spanish-speaking patients' MMSE scores (R=0.39, P<.001). Lexico-semantic features showed lower performance (AUC=0.64) and no significant MMSE prediction (R=-0.31, P=.05). The combined model did not improve results (AUC=0.65; R=0.04, P=.79). These results suggest that while both timing and lexico-semantic features are informative within the same language, only speech timing features demonstrate consistent performance across languages. By focusing on clinically interpretable features, this approach supports the development of clinically usable ASLA tools.
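To make the reported AUC values easier to interpret, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores. It uses the pairwise (Mann-Whitney) formulation: the AUC is the probability that a randomly chosen patient receives a higher score than a randomly chosen control. The function name `roc_auc` and the toy scores are illustrative assumptions and are not taken from the study's pipeline or data.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    labels: iterable of 1 (e.g. patient with AD) or 0 (e.g. healthy control)
    scores: classifier scores, higher meaning "more likely positive"
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy scores, for illustration only.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

On this reading, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale against which the within-language and between-language figures above can be compared.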
This study aims to examine differences in vocabulary performance among Arabic-speaking children ages 1;6-3;6, using a three-domain theoretical framework (biological/genetic, developmental, and social/environmental factors), with particular focus on how these factors differ based on the presence of parental concerns. Data were collected from 874 parents of Palestinian Arabic-speaking children aged 18-36 months using the online Palestinian Arabic Communicative Development Inventory (PA-CDI). An accompanying background questionnaire was used to gather information on five potential risk factors. Children with parental concerns demonstrated significantly lower vocabulary performance across all age groups. The proportion of concerned parents decreased as vocabulary percentile increased. All five examined risk factors correlated significantly with vocabulary performance. Parental concerns emerged as the strongest predictor overall. For children without parental concerns, onset of speaking was the only significant predictor, while for children with concerns, word combination abilities were the sole significant predictor. Our findings highlight the validity of parental concerns as indicators of potential language difficulties and underscore the importance of early language milestones in predicting vocabulary performance. Word combination abilities, in particular, seem valuable for identifying children with persistent language difficulties. The PA-CDI, combined with background variables, shows promise as an effective clinical tool for early identification of children at risk for language delays in Arabic-speaking populations. What is already known on this subject Research has consistently demonstrated the validity and importance of parental concerns in identifying children's developmental issues, particularly in the early detection of DLD. Previous studies have established that parental observations can be as reliable as quality developmental screening tests. 
Early language milestones, including onset of speaking and word combinations, have been identified as significant predictors of vocabulary development. Risk factors for language development have been categorised into three domains: biological/genetic factors (family history, health conditions), developmental indicators (delayed milestones), and social/environmental factors. The CDI has emerged as a valuable tool for assessing vocabulary development and identifying children at risk for DLD across various languages. What this study adds to existing knowledge This study reveals that parental concerns emerge as the strongest predictor of vocabulary performance in Arabic, an understudied language in this context, accounting for 27.2% of variance in language development. For the first time, we demonstrate differential predictive patterns between children with and without parental concerns: Onset of speaking predicts performance for children without concerns, while word combination abilities serve as the sole significant predictor for children with concerns. The study validates a three-domain framework in Arabic-speaking populations and shows that subjective factors demonstrate significantly stronger relationships with vocabulary performance than objective indicators, challenging traditional assumptions about measurement reliability. What are the potential or actual clinical implications of this work? These findings highlight the value of considering parental perspectives alongside standardised assessments to gain a comprehensive understanding of children's linguistic abilities in Arabic-speaking populations. Healthcare professionals should recognise that parental reports, when systematically collected, may provide more sensitive indicators of language development than traditional objective measures. Clinical assessment protocols should be restructured to give greater weight to structured parental observations. 
Word combination abilities, in particular, seem valuable for identifying children with persistent language difficulties. The PA-CDI, combined with background variables, shows promise as an effective clinical tool for early identification of children at risk for language delays in Arabic-speaking populations.
Systematic reviews (SRs) are a cornerstone in providing high-quality evidence that guides policy and practice across various disciplines. Despite their critical role, SRs require substantial financial investment and are constrained by time-consuming manual processes. Existing solutions primarily focus on semi-automating the title and abstract screening stages, yet these approaches still face limitations in terms of efficiency and practicality. The SR process comprises several stages beyond abstract screening, each of which is resource-intensive. To overcome these challenges, this paper introduces ReviewGenie, a novel system that automates SR stages up to and including abstract screening, utilizing artificial intelligence. The SR process involves eight key stages, beginning with the definition of search keywords and the selection of target databases, and culminating in full screening. While the initial and final stages require human expertise, the intermediate stages can be automated. ReviewGenie automates all intermediary stages, including database searching, data retrieval, cleaning, deduplication, filtering, and abstract screening. The system is domain-agnostic, as evidenced by a case study focused on databases related to speech and language disorders. ReviewGenie significantly reduces the workload across various stages of the SR process, delivering notable time and cost savings while enhancing efficiency and accuracy. In the case study, where the article-fetching stage involved tens of thousands of publications, ReviewGenie achieved a 2.62% improvement in duplicate detection in less than a second, compared to the 1 to 3 h typically required for manual deduplication of 100 records. This process included cleaning abstracts before removing duplicates. Additionally, ReviewGenie reduced the number of articles from 28,674 to 3520 using an automatic filtering approach executed in seconds. 
This substantial reduction underscores the effectiveness of our automated method in preparing datasets for the abstract screening stage. Moreover, the artificial intelligence-driven abstract screening method resulted in cost savings exceeding $6230 compared to manual methods. ReviewGenie represents a significant advancement in reducing the burden on researchers conducting comprehensive systematic reviews. By automating intermediate stages, ReviewGenie enhances efficiency, accuracy, and cost-effectiveness, establishing itself as an indispensable tool for SRs across various disciplines.
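The clean-then-deduplicate step described above can be illustrated with a short sketch. This is not ReviewGenie's implementation, which the abstract does not detail; it assumes, for illustration only, that each record is a dictionary with hypothetical `title` and `abstract` fields and that two records are duplicates when their cleaned title and abstract match exactly.

```python
import re


def normalise(text):
    """Clean a text field: drop HTML-like tags, lowercase, and collapse
    every run of non-alphanumeric characters into a single space."""
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def deduplicate(records):
    """Keep the first record for each cleaned (title, abstract) pair.

    The matching rule (exact match after cleaning) is an assumption of
    this sketch; real systems may use DOIs or fuzzy matching instead.
    """
    seen = set()
    unique = []
    for record in records:
        key = (normalise(record["title"]), normalise(record["abstract"]))
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records fetched from two databases.
records = [
    {"title": "Speech Therapy: A Review", "abstract": "<p>Background.</p>"},
    {"title": "speech therapy - a review ", "abstract": "Background"},
    {"title": "Language Disorders in Children", "abstract": "Methods."},
]
unique_records = deduplicate(records)
```

Cleaning before comparison matters because the same article often arrives with different casing, punctuation or markup depending on the source database, so a comparison on raw strings would miss such duplicates.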
The COVID-19 pandemic disproportionately affected frail individuals, especially those living in long-term care (LTC) homes. This study examined the role of linguistic factors in COVID-19 related outcomes in LTC homes. We performed a population-based, retrospective cohort study of residents living in LTC homes in Ontario, Canada who were diagnosed with COVID-19 between March 31, 2020 and March 31, 2021. Resident language, obtained from LTC assessments, was used to classify residents into one of three linguistic groups: Anglophone (English), Francophone (French), and allophone (other language). Language of the LTC home was determined using a person-time representation of the languages spoken by residents within each LTC home. We defined LTC facilities as French homes when Francophone residents contributed more than 25% of the person-days, and allophone homes when allophone residents contributed more than 50% of the person-days. Residents whose language corresponded to the language of the LTC home in which they were living were said to have received language-concordant care, while all other residents were said to have received language-discordant care. The outcomes of this study were ED visits, hospitalizations, and mortality within 90 days. We included a total of 26,829 LTC residents (20,315 Anglophones, 1,032 Francophones, and 5,482 allophones) living in 572 LTC homes (502 English, 28 French, 42 allophone) who were diagnosed with COVID-19. LTC residents who lived in language-discordant homes were more likely to have ED visits (adjusted HR 1.12, 95% CI 1.01-1.25) and hospitalizations (adjusted HR 1.15, 95% CI 1.02-1.29) when compared to LTC residents who lived in language-concordant homes. Resident-facility language discordance was not associated with overall mortality (adjusted HR 1.00, 95% CI 0.91-1.10) or in-hospital mortality (adjusted HR 1.04, 95% CI 0.88-1.23). 
Residents living in language-discordant LTC facilities experienced more ED visits and hospitalizations following diagnosis of COVID-19. The findings of this study highlight the importance of providing frail, vulnerable individuals with linguistically concordant care.
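The person-day rule used to label homes and residents can be sketched in a few lines. The thresholds (more than 25% Francophone person-days for a French home, more than 50% allophone person-days for an allophone home) come from the abstract; the function names, the dictionary layout and the choice to test the French threshold first when both thresholds are met are assumptions of this sketch, since the abstract does not say how such a case was handled.

```python
FRENCH_THRESHOLD = 0.25      # share of person-days, from the abstract
ALLOPHONE_THRESHOLD = 0.50   # share of person-days, from the abstract

# Maps a resident's linguistic group to the home label that counts as concordant.
CONCORDANT_HOME = {
    "anglophone": "english",
    "francophone": "french",
    "allophone": "allophone",
}


def classify_home(person_days):
    """Label a home from person-days contributed by each linguistic group.

    person_days: dict such as {"anglophone": 700, "francophone": 300}.
    Assumption: the French rule is checked before the allophone rule.
    """
    total = sum(person_days.values())
    if total <= 0:
        raise ValueError("home has no person-days")
    if person_days.get("francophone", 0) / total > FRENCH_THRESHOLD:
        return "french"
    if person_days.get("allophone", 0) / total > ALLOPHONE_THRESHOLD:
        return "allophone"
    return "english"


def is_language_concordant(resident_group, home_label):
    """True when the resident's group matches the language of the home."""
    return CONCORDANT_HOME[resident_group] == home_label


# Hypothetical home: 30% of person-days contributed by Francophone residents.
home = classify_home({"anglophone": 700, "francophone": 300, "allophone": 0})
```

Under this rule a Francophone resident of the hypothetical home above would count as receiving language-concordant care, while an Anglophone resident of the same home would count as language-discordant.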
Generative artificial intelligence (AI) has the potential to be used in supporting people with eating disorders (EDs), but this also presents certain risks. This study aimed to compare the psycholinguistic attributes (language markers of cognitive, emotional, and social processes) and lexico-semantic characteristics (patterns of word choice and meaning in text) of AI responses with those of human responses in online communities (OCs), and to assess the potential harms of AI responses. We collected pre-COVID data from Reddit communities on EDs, consisting of 3634 posts and 22,359 responses. For each post, responses were generated using four widely used state-of-the-art AI models (GPT, Gemini, Llama, and Mistral) with prompts tailored to peer support. The Linguistic Inquiry and Word Count (LIWC) lexicon was used to examine psycholinguistic features across eight dimensions, and a suite of lexico-semantic comparisons was conducted across the dimensions of linguistic structure, style, and semantics. Additionally, 100 AI-generated responses were qualitatively analyzed by clinicians to identify potential harm. Using OC responses as a comparison, AI responses were generally longer, more polite, yet more repetitive and less creative than human responses. Empathy scores varied among models. Qualitative analysis revealed themes of possible reinforcement of ED behaviors, implicit biases (e.g., favoring weight loss), and an inability to acknowledge contextual nuances, such as insensitivity to emotional cues and overgeneralized health advice. All AI chatbots produced responses containing harmful content, such as promoting ED behaviors or biases, to varying degrees. Findings highlight differences between AI and OC responses, with potential risks of harm when using AI in ED peer support. Ethical considerations include the need for safeguards to prevent reinforcement of harmful behaviors and biases. 
This research underscores the importance of cautious AI integration; further validation and the development of guidelines are needed to ensure safe and effective support.
Ensuring safe and effective use of artificial intelligence (AI) requires understanding and anticipating its performance on new tasks, from advanced scientific challenges to transformed workplace activities [1-3]. So far, benchmarking has guided progress in AI but has offered limited explanatory and predictive power for general-purpose AI systems [4-8], attributed to limited transferability across specific tasks [9-11]. Here we introduce general scales for AI evaluation that elicit demand profiles explaining what capabilities common AI benchmarks truly measure, extract ability profiles quantifying the general strengths and limits of AI systems and robustly predict AI performance for new task instances. Our fully automated methodology builds on 18 rubrics, capturing a broad range of cognitive and intellectual demands, which place different task instances on the same general scales, illustrated on 15 large language models (LLMs) and 63 tasks. Both the demand and the ability profiles on these scales bring new insights such as construct validity through benchmark sensitivity and specificity and explain conflicting claims about whether AI has reasoning capabilities. Ultimately, high predictive power at the instance level becomes possible using the general scales, providing superior estimates over strong black-box baseline predictors, especially in out-of-distribution settings (new tasks and benchmarks). The scales, rubrics, battery, techniques and results presented here constitute a solid foundation for a science of AI evaluation, underpinning the reliable deployment of AI in the years ahead.
This review aims to explore the potential and challenges of using Natural Language Processing (NLP) to detect, correct, and mitigate medically inaccurate information, including errors, misinformation, and hallucination. By unifying these concepts, the review emphasizes their shared methodological foundations and their distinct implications for healthcare. Our goal is to advance patient safety, improve public health communication, and support the development of more reliable and transparent NLP applications in healthcare. A scoping review was conducted following PRISMA-ScR guidelines, analyzing studies from 2020 to 2024 across five databases. Studies were selected based on their use of NLP to address medically inaccurate information and were categorized by topic, tasks, document types, datasets, models, and evaluation metrics. NLP has shown potential in addressing medically inaccurate information on the following tasks: (1) error detection, (2) error correction, (3) misinformation detection, (4) misinformation correction, (5) hallucination detection, and (6) hallucination mitigation. However, challenges remain with data privacy, context dependency, and evaluation standards. This review highlights the advancements in applying NLP to tackle medically inaccurate information while underscoring the need to address persistent challenges. Future efforts should focus on developing real-world datasets, refining contextual methods, and improving hallucination management to ensure reliable and transparent healthcare applications.
This study examines associations between patient demographics, clinical status, and linguistic features of text messages with engagement in a message-based intervention for serious mental illness. Data from a randomized controlled trial of a message-based mental health intervention were analyzed. Engagement was operationalized as total texts sent per day and total number of disengaged days. Linguistic Inquiry and Word Count identified expressions of affect, social processes, thinking styles, health, and time orientation. Generalized estimating equations assessed associations between demographic, clinical, and Linguistic Inquiry and Word Count variables with engagement across three different time intervals. Among 39 participants, most were male (n = 23, 59%), with diagnoses of schizophrenia (n = 16, 41%), schizoaffective disorder (n = 9, 23%), bipolar disorder (n = 9, 23%), and major depressive disorder (n = 5, 13%). Participants sent approximately two messages per day, with 48% of days disengaged. Race, education, and diagnosis were associated with engagement. Black participants and those with at least some college education sent more texts while individuals with schizophrenia had more disengaged days. Messages containing language about anxiety, friendship, cognitive processes, and common verbs were associated with engagement. Significant relationships between message content and future engagement were observed, particularly in the first 2 weeks, as well as in messages sent the day and week before a disengaged day. Demographic, clinical, and linguistic features are related to engagement in message-based interventions for serious mental illness. Identifying these characteristics can help tailor interventions, enhancing engagement, and reducing dropout rates in digital mental health interventions.
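The two engagement measures named above, texts sent per day and number of disengaged days, can be derived from message timestamps roughly as follows. This is only a sketch: the abstract does not define a disengaged day, so the code assumes it means a day within the observation window on which the participant sent no texts, and the function name and dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def engagement_summary(message_dates, start, end):
    """Summarise engagement over the inclusive window [start, end].

    message_dates: one datetime.date per text the participant sent.
    Assumption of this sketch: a "disengaged day" is a day in the
    window with zero sent texts.
    Returns (texts_per_day, disengaged_days).
    """
    counts = Counter(message_dates)
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    texts_per_day = {day: counts.get(day, 0) for day in days}
    disengaged_days = sum(1 for day in days if texts_per_day[day] == 0)
    return texts_per_day, disengaged_days


# Hypothetical participant: three texts over a four-day window.
sent = [date(2024, 1, 1), date(2024, 1, 1), date(2024, 1, 3)]
per_day, disengaged = engagement_summary(sent, date(2024, 1, 1), date(2024, 1, 4))
mean_texts_per_day = sum(per_day.values()) / len(per_day)
```

Daily counts of this kind are the sort of repeated, within-participant outcome that generalized estimating equations are designed to model.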
Predictions of Health-Related Quality of Life (HRQoL) outcomes could support realistic recovery expectations after breast cancer (BC) surgery. We aimed to develop and validate prediction models for HRQoL outcomes after BC surgery. We used three datasets of BC patients from Berlin, Germany; Ljubljana, Slovenia; and Rotterdam, the Netherlands. We included non-metastasised patients who were surgically treated for an initial diagnosis of BC and completed pre- and postoperative validated questionnaires. We used linear mixed models to analyse 15 domains of the EORTC QLQ-C30 and EORTC QLQ-BR23 over a two-year horizon. Baseline domain score (measured pre-operatively), age, BMI, smoking, TN stage, receptor status, neoadjuvant chemotherapy, axillary surgery and surgery type (breast-conserving, mastectomy, and immediate implant-based reconstruction) were included as predictors. Predictive performance at validation was assessed by the proportion of variance explained (marginal R2; mR2). We included N = 795 patients from Germany for development and N = 623 from Slovenia and N = 417 from the Netherlands for validation. The largest proportion of variance was explained by the prediction models for sexual functioning (SF, mR2 35%), physical functioning (PF, mR2 29%), body image (BI, mR2 26%), and cognitive functioning (CF, mR2 25%). The models captured meaningfully different trends over time for different outcomes and surgery types. The predictive performance of the models was largely driven by the baseline domain score. Performance was reasonable at external validation, with R2 values of 19-33% for PF, 10-17% for CF, 15-18% for BI, and 22-28% for SF, although some other outcomes (e.g. breast symptoms and role functioning) showed miscalibration, indicating a need for recalibration.
HRQoL after breast cancer surgery can be predicted using simple models with baseline domain scores and surgery type, demonstrating a new opportunity for Patient-Reported Outcome Measures (PROMs) in personalized care.
This study examined the diagnostic power and robustness of sentence repetition (SR) in distinguishing monolingual English-speaking children with developmental language disorder (DLD) from typically developing (TD) children, using a large-scale secondary analysis of the Surrey Communication and Language in Education Study data set. We evaluated SR's effectiveness across different scoring methods in age-matched groups across three different time points. A total of 407 children were included. Bayesian beta regression models compared SR performance between groups across four scoring methods. SR's diagnostic accuracy was assessed using area under the curve (AUC), sensitivity, specificity, and likelihood ratios. Children with DLD consistently performed less accurately than TD peers on SR tasks, with differences persisting across ages 5-11 years and across all scoring methods. AUC values indicated good to excellent discriminative ability, especially in younger groups, though sensitivity and specificity fell below clinical thresholds. These findings support SR's diagnostic potential as a reliable language measure. SR may serve as a valuable tool due to its ease of use and scoring flexibility, though additional assessments are recommended for comprehensive language evaluation. https://doi.org/10.23641/asha.31431517.
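The accuracy measures listed in this abstract (AUC, sensitivity, specificity, and likelihood ratios) follow standard definitions, sketched below in Python. All counts and scores are invented for illustration and are not data from the study. The names `ConfusionCounts`, `diagnostic_metrics`, and `auc_from_scores` are hypothetical, and the sketch assumes that lower SR scores indicate DLD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing an SR cut-off with the reference diagnosis."""
    true_pos: int   # children with DLD flagged by the cut-off
    false_neg: int  # children with DLD missed by the cut-off
    true_neg: int   # TD children correctly passed
    false_pos: int  # TD children flagged by the cut-off


def diagnostic_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Sensitivity, specificity, and likelihood ratios.

    Raises ZeroDivisionError when specificity is exactly 0 or 1; a real
    analysis would need to handle those edge cases explicitly.
    """
    sensitivity = c.true_pos / (c.true_pos + c.false_neg)
    specificity = c.true_neg / (c.true_neg + c.false_pos)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": sensitivity / (1 - specificity),
        "lr_negative": (1 - sensitivity) / specificity,
    }


def auc_from_scores(dld_scores: list[float], td_scores: list[float]) -> float:
    """AUC as the probability that a randomly chosen TD child scores higher
    than a randomly chosen child with DLD, counting ties as one half."""
    wins = 0.0
    for d in dld_scores:
        for t in td_scores:
            if t > d:
                wins += 1.0
            elif t == d:
                wins += 0.5
    return wins / (len(dld_scores) * len(td_scores))


# Invented example: 50 children with DLD and 100 TD children.
print(diagnostic_metrics(ConfusionCounts(true_pos=40, false_neg=10, true_neg=80, false_pos=20)))
print(auc_from_scores([0.2, 0.4, 0.5], [0.5, 0.7, 0.9]))
```

The pairwise definition of AUC does not depend on any single cut-off, which is one way a measure can show good overall discrimination while sensitivity and specificity at a particular threshold remain below clinical benchmarks.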
This longitudinal study investigated how kindergarten oral language skills predict second-grade reading comprehension in bilinguals with second language (L2) German, as well as the moderating role of word-decoding skills in this relationship. Fifty-nine Russian-German and Turkish-German bilingual children were followed from kindergarten to Grade 2. Oral language skills, including vocabulary and morphosyntax, were measured in kindergarten, and word-decoding speed and sentence- and text-level reading comprehension were assessed in Grade 2. Age in kindergarten, age of onset (AoO) of L2 German, and phonological short-term memory were included as covariates. Generalized mixed-effects models showed that vocabulary significantly predicted sentence and text reading comprehension in Grade 2 after controlling for age, AoO of L2 German, and phonological skills. Morphosyntactic skills were not significant. Moderation analyses revealed that the positive association between early vocabulary and text-level comprehension was more pronounced among children with average to strong word-decoding skills. For sentence-level comprehension, the interaction between word decoding and vocabulary was not significant. Word-decoding speed and kindergarten vocabulary emerged as the strongest predictors of later reading comprehension in bilingual children acquiring L2 German, with the impact of vocabulary being more pronounced among children with more developed word-decoding skills. This study highlights the importance of supporting early vocabulary and decoding skills in bilingual children to promote later reading comprehension.
Globally, one billion people suffer from mental health disorders. Migrant populations face high prevalence rates of some disorders and significant barriers in accessing mental healthcare, including language-related barriers. However, knowledge about specific communication difficulties arising from language barriers and mitigation strategies is limited, as is knowledge about country-specific differences. This study explores health and social care providers' (HSCPs') perceptions of mental health service accessibility for migrants, language-related communication difficulties, mitigation strategies and their perceived effectiveness, and the effectiveness of HSCP training in working with migrants. We conducted a cross-sectional survey of HSCPs in nine European countries (n = 629). HSCPs perceive mental health services as largely inaccessible for migrants facing language barriers. Cross-regional comparative analysis identified differences in the frequency of HSCPs' interactions with migrants seeking support for their mental health where language barriers are present and in how often HSCPs reported experiencing communication difficulties when doing so. HSCPs report a lack of training in communicating with migrants across language barriers, with recent training associated with more positive perceptions of its usefulness. Communication difficulties were encountered throughout the care journey. Informal strategies, such as assistance from family and friends, and machine translation, are commonly used but seen as ineffective. Onsite professional/trained interpreters are deemed most effective, yet their availability is limited. Findings highlight the urgent need for better communication strategies and awareness of the benefits and drawbacks of different strategies to enhance mental health service accessibility for migrants.
Video-mediated listening has been recognized for its potential to facilitate learners' comprehension, engagement, and motivation by providing richer contextual cues and audiovisual input. However, the development of authentic listening assessments typically requires substantial time and financial resources, which hinders their practical implementation. This study aims to address these challenges by applying generative AI to develop a video-mediated listening assessment. Multimodal generative AI (InVideo) was employed to create four videos, while GPT-4 was used to automatically generate 20 test items targeting five listening subskills. A total of 542 university students completed the AI-based video-mediated listening assessment within 30 min, followed by a questionnaire. Additionally, five students and five teachers were selected for semi-structured interviews. Results from psychometric models showed that 19 out of 20 items effectively measured five targeted listening subskills, with appropriate item difficulty and discrimination, confirming construct validity. The questionnaire suggested that the AI-generated videos were adequate in terms of audio, visual, and audio-visual consistency, providing support for the relevance and utility of the test. Questionnaires and semi-structured interviews indicated that video-mediated listening seemed to improve students' listening ability and enhance their interest, motivation, and engagement compared to audio-based methods, while teachers reported that multimodal AI reduced the emotional labor of preparing audiovisual materials in language teaching, supporting the assessment's positive consequences. The fundamental validity considerations of AI-based video-mediated listening assessment and implications for language learning and teaching were discussed.
The rapid emergence of ChatGPT has sparked extensive academic discourse across multiple fields. This study focuses on such discourse within the social sciences by examining how scholars frame and evaluate ChatGPT through research article abstracts. Drawing on 1,227 SSCI-indexed abstracts published between 30 November 2022 and 30 November 2024, we adopt a two-step natural language processing approach. First, we apply topic modeling to identify major thematic patterns in academic discussions of ChatGPT. Then, we perform sentiment analysis to examine how scholars' evaluative attitudes are discursively constructed across these thematic areas. Topic modeling reveals six key themes: artificial intelligence (AI) and technology communication, education and learning tools, user perception and adoption, ethics and academic challenges, human-technology interaction, and computational foundations of Large Language Models (LLMs). Sentiment analysis suggests that approximately 82.97% of abstracts express positive attitudes, particularly regarding ChatGPT's research potential and pedagogical utility, while around 9.78% reflect more cautious or negative views, often focusing on issues such as academic integrity and misinformation. These sentiment patterns appear to vary across thematic areas, with user adoption and education-related topics showing greater positivity, while ethics-oriented discussions exhibit relatively more critical perspectives. By analyzing academic discourse as reflected in research article abstracts, this study contributes a discourse-level perspective on how ChatGPT is framed, endorsed, and critically examined in the social sciences. It offers a data-driven complement to existing conceptual and survey-based investigations and draws attention to both the thematic and evaluative tendencies shaping scholarly narratives around generative AI.
Neural development differs between in-utero and ex-utero environments. Gestational age (GA) is associated with brain development and early life neurodevelopmental outcomes, affecting both preterm and term infants. This study aimed to examine a wide range of GA and provide a more comprehensive understanding of its effects on various developmental domains. Four hundred fifty-four infants who were born at 24 to 41 weeks of GA were included in this analysis. Cognitive, language, and motor development between 8 and 30 months of age were assessed using the Bayley Scales of Infant and Toddler Development, Third Edition (Bayley-III). Associations between GA and outcomes were analyzed using linear and logistic mixed-effects models. GA was positively associated with all examined developmental domains with a small-sized effect (Pearson's correlation coefficients: 0.08-0.15; p < 0.05). After adjusting for covariates, linear mixed-effects models estimated that each additional week of GA was associated with an increase in Bayley-III composite scores: cognitive (0.6 points), language (0.6 points), and motor (0.62 points). Logistic mixed-effects models showed that, after adjusting for the covariates, each additional week of GA was associated with a 13% reduction in the adjusted odds of delay in one of the language subdomains (i.e., receptive communication). We found a small impact of GA on cognitive, language, and motor development across a wide range of GA. Language and its subdomains seem particularly sensitive to the effects of prematurity. Thus, regular monitoring and parent-based early intervention, especially in the language domain, are warranted for early-term and preterm infants.
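The 13% figure in this abstract can be read as a per-week odds ratio, and a short Python sketch shows the arithmetic. It assumes the usual logistic-model reading, in which a 13% reduction in odds per week corresponds to an odds ratio of 0.87 that multiplies across weeks. The function names are hypothetical, and the four-week extrapolation is illustrative only; it is not a result reported by the study.

```python
import math


def odds_ratio_from_percent_reduction(percent_reduction: float) -> float:
    """Convert 'odds reduced by X% per unit' into an odds ratio per unit."""
    return 1.0 - percent_reduction / 100.0


def odds_ratio_over_units(or_per_unit: float, units: float) -> float:
    """Odds ratios compound multiplicatively under a log-linear (logistic) model."""
    return or_per_unit ** units


def log_odds_coefficient(odds_ratio: float) -> float:
    """The logistic regression coefficient is the natural log of the odds ratio."""
    return math.log(odds_ratio)


or_week = odds_ratio_from_percent_reduction(13)   # 0.87 per additional week of GA
print(or_week)
print(log_odds_coefficient(or_week))              # negative: odds fall as GA rises
print(odds_ratio_over_units(or_week, 4))          # illustrative 4-week difference
```

An odds ratio describes odds rather than probabilities, so a 13% reduction in the odds of delay does not translate into a 13% reduction in the risk of delay unless the outcome is rare.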
While Australia attracts overseas tourists and residents for beach-related leisure and sports, coastal drowning is a serious issue in Australia, with 150 drowning deaths and about 9000 rescues recorded in 2023/2024. Culturally and linguistically diverse (CALD) communities are at high risk due to limited awareness, linguistic barriers, lack of rip current knowledge and limited access to water safety education. To provide further support for CALD communities, this study developed a new pedagogical tool, a reading exercise for standardised English exams such as IELTS, incorporating beach safety information; it examined to what extent this material would improve international students' knowledge of rip currents and safety strategies. This study utilised a quasi-experimental design to measure the improvement of beach safety knowledge using a pre-test, post-test and follow-up test. Statistical data were analysed in SPSS and RStudio, using descriptive analysis and generalised estimating equations. Additionally, a thematic analysis of textual responses was conducted in NVivo. The results show that there was a significant improvement (p < 0.01) in the participants' knowledge of rips, beach flags and safety signage warnings after using the material. Additionally, participants started pointing out a wide range of characteristics when describing rip currents. Although some deterioration of knowledge (except regarding beach flags) was detected 4 weeks later, the improvement was still significant across all topics. So what? Considering the high demand for English language learning material among migrants in Australia, this material should be shared with CALD communities to improve beach safety knowledge.