20 results found
Automated stuttering detection (ASD) systems struggle with paediatric speech due to high acoustic variability in developing voices and the subtle distinction between pathological stuttering and typical developmental disfluencies. We introduce Paediatric-HGNN, a framework using a Context-aware Part-whole Interaction Network (CaPIN) tailored for paediatric data. Instead of conventional 1D signal modelling, our approach builds a heterogeneous graph capturing hierarchical relationships between lexical units (word nodes) and fine-grained acoustic segments (frame nodes). Trained on curated paediatric corpora (UCLASS and FluencyBank), Paediatric-HGNN achieves 82.4% weighted accuracy and a Typical Disfluency F1-score of 0.386. Modelling hierarchical lexical-acoustic interactions captures developmental "searching" behaviour, offering a more robust and interpretable tool for early clinical intervention.
Thanks to the rapidly evolving integration of LLMs into decision-support tools, a significant transformation is happening across large-scale systems. As in other medical fields, the use of LLMs such as GPT-4 is attracting increasing interest in radiation oncology. GPT-4's performance in radiation oncology was first assessed via a dedicated 100-question examination on the highly specialized topic of radiation oncology physics, which revealed GPT-4's superiority over other LLMs. GPT-4's performance in the broader field of clinical radiation oncology was further benchmarked on the ACR Radiation Oncology In-Training (TXIT) exam, where GPT-4 achieved a high accuracy of 74.57%. Its performance on re-labelling structure names in accordance with the AAPM TG-263 report has also been benchmarked, achieving accuracies above 96%. Such studies shed light on the potential of LLMs in radiation oncology. Although interest in the potential and constraints of LLMs in general healthcare applications continues to rise, the capabilities and limitations of LLMs in radiation oncology decision support have not yet been fully explored.
Paediatric obstructive sleep apnoea (OSA) is clinically significant yet difficult to diagnose, as children poorly tolerate sensor-based polysomnography. Acoustic monitoring provides a non-invasive alternative for home-based OSA screening, but limited paediatric data hinders the development of robust deep learning approaches. This paper proposes a transfer learning framework that adapts acoustic models pretrained on adult sleep data to paediatric OSA detection, incorporating SpO2-based desaturation patterns to enhance model training. Using a large adult sleep dataset (157 nights) and a smaller paediatric dataset (15 nights), we systematically evaluate (i) single- versus multi-task learning, (ii) encoder freezing versus full fine-tuning, and (iii) the impact of delaying SpO2 labels to better align them with the acoustics and capture physiologically meaningful features. Results show that fine-tuning with SpO2 integration consistently improves paediatric OSA detection compared with baseline models without adaptation. These findings demonstrate the feasibility of transfer learning for home-based OSA screening in children and highlight its potential clinical value for early diagnosis.
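The label-delay idea in item (iii) of the abstract above can be illustrated with a minimal sketch. It assumes that acoustic frames and SpO2 labels are sampled on the same time grid and that "delaying" means pairing the frame at index i with the SpO2 label at index i + delay; the abstract does not state the actual alignment scheme, and the function name `align_delayed_labels` is hypothetical.

```python
def align_delayed_labels(acoustic_frames, spo2_labels, delay):
    """Pair acoustic frame i with the SpO2 label at index i + delay.

    Assumption (not from the abstract): both sequences share one time grid,
    and `delay` is a non-negative number of steps on that grid. Frames that
    have no label `delay` steps later are dropped.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    # Number of frames that still have a label `delay` steps ahead.
    usable = min(len(acoustic_frames), len(spo2_labels) - delay)
    return [
        (acoustic_frames[i], spo2_labels[i + delay])
        for i in range(max(usable, 0))
    ]


if __name__ == "__main__":
    frames = ["f0", "f1", "f2", "f3"]
    labels = [0, 0, 1, 1]  # 1 marks a desaturation step
    print(align_delayed_labels(frames, labels, delay=2))
```

Sweeping `delay` over a small range and comparing validation performance would be one way to choose the lag under these assumptions.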
Mathematical oncology is an interdisciplinary research field where the mathematical sciences meet cancer research. Being situated at the intersection of these two fields makes mathematical oncology highly dynamic, as practicing researchers are incentivised to quickly adapt to both technical and medical research advances. Determining the scope of mathematical oncology is therefore not straightforward; however, it is important for purposes related to funding allocation, education, scientific communication, and community organisation. To address this issue, we here conduct a bibliometric analysis of mathematical oncology. We compare our results to the broader field of mathematical biology, and position our findings within theoretical science of science frameworks. Based on article metadata and citation flows, our results provide evidence that mathematical oncology has undergone a significant evolution since the 1960s marked by increased interactions with other disciplines, geographical expansion, larger research teams, and greater diversity in studied topics. The latter finding contributes to the greater discussion on which models different research communities consider to be valuable
Traditional health authority approval for oncology drugs is based on a clinical benefit endpoint, or a valid surrogate. In 1992 the FDA created the Accelerated Approval pathway to allow for earlier approval of therapies in serious conditions with an unmet medical need. This is accomplished typically by granting accelerated approval based on a surrogate endpoint that can be measured earlier than a traditional approval endpoint. Minimal residual disease (MRD) is a sensitive measure of residual cancer cells in hematology oncology after treatment, and is increasingly considered as a secondary or exploratory endpoint due to its prognostic potential for traditional clinical trial endpoints such as progression-free survival (PFS) and overall survival (OS). This work aims to evaluate MRD's surrogacy potential across several hematologic cancer indications while keeping the focus on follicular lymphoma (FL), using data from published studies. We examine individual-level and trial-level correlations extracted from previously published studies to elucidate the potential role of MRD in accelerating the drug approval process in hematology oncology trials.
Electronic Patient Record (EPR) systems contain valuable clinical information, but much of it is trapped in unstructured text, limiting its use for research and decision-making. Large language models can extract such information but require substantial computational resources to run locally, and sending sensitive clinical data to cloud-based services, even when deidentified, raises significant patient privacy concerns. In this study, we develop a resource-efficient semi-automated annotation workflow using small language models (SLMs) to extract structured information from unstructured EPR data, focusing on paediatric histopathology reports. As a proof-of-concept, we apply the workflow to paediatric renal biopsy reports, a domain chosen for its constrained diagnostic scope and well-defined underlying biology. We develop the workflow iteratively with clinical oversight across three meetings, manually annotating 400 reports from a dataset of 2,111 at Great Ormond Street Hospital as a gold standard, while developing an automated information extraction approach using SLMs. We frame extraction as a Question-Answering task grounded by clinician-guided entity guidelines and few-shot exampl
The application of AI in oncology has been limited by its reliance on large, annotated datasets and the need for retraining models for domain-specific diagnostic tasks. Taking heed of these limitations, we investigated in-context learning (ICL) as a pragmatic alternative to model retraining, allowing models to adapt to new diagnostic tasks using only a few labeled examples at inference. Using four vision-language models (VLMs), namely PaliGemma, CLIP, ALIGN and GPT-4o, we evaluated performance across three oncology datasets: MHIST, PatchCamelyon and HAM10000. To the best of our knowledge, this is the first study to compare the performance of multiple VLMs on different oncology classification tasks. Without any parameter updates, all models showed significant gains with few-shot prompting, with GPT-4o reaching an F1 score of 0.81 in binary classification and 0.60 in multi-class classification settings. While these results remain below the ceiling of fully fine-tuned systems, they highlight the potential of ICL to approximate task-specific behavior using only a handful of examples, reflecting how clinicians often reason from prior cases. Notably, open-source
Unstructured notes within the electronic health record (EHR) contain rich clinical information vital for cancer treatment decision making and research, yet reliably extracting structured oncology data remains challenging due to extensive variability, specialized terminology, and inconsistent document formats. Manual abstraction, although accurate, is prohibitively costly and unscalable. Existing automated approaches typically address narrow scenarios - either using synthetic datasets, restricting focus to document-level extraction, or isolating specific clinical variables (e.g., staging, biomarkers, histology) - and do not adequately handle patient-level synthesis across the large number of clinical documents containing contradictory information. In this study, we propose an agentic framework that systematically decomposes complex oncology data extraction into modular, adaptive tasks. Specifically, we use large language models (LLMs) as reasoning agents, equipped with context-sensitive retrieval and iterative synthesis capabilities, to exhaustively and comprehensively extract structured clinical variables from real-world oncology notes. Evaluated on a large-scale dataset of over 40
Paediatric kidney disease varies widely in its presentation and progression, which calls for continuous monitoring of renal function. Using electronic health records collected between 2019 and 2025 at Great Ormond Street Hospital, a leading UK paediatric hospital, we explored a temporal modelling approach that integrates longitudinal laboratory sequences with demographic information. A recurrent neural model trained on these data was used to predict whether a child would record an abnormal serum creatinine value within the following thirty days. Framed as a pilot study, this work provides an initial demonstration that simple temporal representations can capture useful patterns in routine paediatric data and lays the groundwork for future multimodal extensions using additional clinical signals and more detailed renal outcomes.
Clinical oncology generates vast, unstructured data that often contain inconsistencies, missing information, and ambiguities, making it difficult to extract reliable insights for data-driven decision-making. General-purpose large language models (LLMs) struggle with these challenges due to their lack of domain-specific reasoning, including specialized clinical terminology, context-dependent interpretations, and multi-modal data integration. We address these issues with an oncology-specialized, efficient, and adaptable NLP framework that combines instruction tuning, retrieval-augmented generation (RAG), and graph-based knowledge integration. Our lightweight models prove effective at oncology-specific tasks, such as named entity recognition (e.g., identifying cancer diagnoses), entity linking (e.g., linking entities to standardized ontologies), TNM staging, document classification (e.g., cancer subtype classification from pathology reports), and treatment response prediction. Our framework emphasizes adaptability and resource efficiency. We include minimal German instructions, collected at the University Hospital Zurich (USZ), to test whether small amounts of non-English language dat
Paediatric Acute Myeloid Leukemia (AML) is a complex adaptive ecosystem with high morbidity. Current trajectory inference algorithms struggle to predict causal dynamics in AML progression, including relapse and recurrence risk. We propose a symbolic AI and deep learning framework grounded in complexity science, integrating Recurrent Neural Networks, Transformers, and Algorithmic Information Dynamics to model longitudinal single-cell transcriptomics and infer complex state transitions in paediatric AML. We identify key plasticity markers as predictive signatures regulating developmental trajectories. These were derived by integrating deep learning with complex-systems-based network perturbation analysis and dynamical systems theory to infer high-dimensional state-space attractors steering AML evolution. Findings reveal dysregulated epigenetic and developmental patterning, with AML cells in maladaptive, reprogrammable plastic states, i.e., developmental arrest blocking terminal differentiation. Predictions forecast neurodevelopmental and morphogenetic signatures guiding AML cell fate bifurcations, suggesting ectoderm-mesoderm crosstalk during disrupted differentiation. Neuroplasticity and
Cancer evolves continuously over time through a complex interplay of genetic, epigenetic, microenvironmental, and phenotypic changes. This dynamic behavior drives uncontrolled cell growth, metastasis, immune evasion, and therapy resistance, posing challenges for effective monitoring and treatment. However, today's data-driven research in oncology has primarily focused on cross-sectional analysis using data from a single modality, limiting the ability to fully characterize and interpret the disease's dynamic heterogeneity. Advances in multiscale data collection and computational methods now enable the discovery of longitudinal multimodal biomarkers for precision oncology. Longitudinal data reveal patterns of disease progression and treatment response that are not evident from single-timepoint data, enabling timely abnormality detection and dynamic treatment adaptation. Multimodal data integration offers complementary information from diverse sources for more precise risk assessment and targeting of cancer therapy. In this review, we survey methods of longitudinal and multimodal modeling, highlighting their synergy in providing multifaceted insights for personalized care tailored to
Background: Lung cancer ranks as the leading cause of cancer-related mortality worldwide. The complexity of tumor delineation, crucial for radiation therapy, requires expertise often unavailable in resource-limited settings. Artificial Intelligence (AI), particularly with advancements in deep learning (DL) and natural language processing (NLP), offers potential solutions yet is challenged by high false positive rates. Purpose: The Oncology Contouring Copilot (OCC) system is developed to leverage oncologist expertise for precise tumor contouring using textual descriptions, aiming to increase the efficiency of oncological workflows by combining the strengths of AI with human oversight. Methods: Our OCC system initially identifies nodule candidates from CT scans. Employing Language Vision Models (LVMs) like GPT-4V, OCC then effectively reduces false positives with clinical descriptive texts, merging textual and visual data to automate tumor delineation, designed to elevate the quality of oncology care by incorporating knowledge from experienced domain experts. Results: Deployments of the OCC system resulted in a significant reduction in the false discovery rate by 35.0%, a 72.4% decrea
Pneumonia remains one of the leading causes of death among children worldwide, underscoring a critical need for fast and accurate diagnostic tools. In this paper, we propose an interpretable deep learning model based on Residual Networks (ResNets) for automatically diagnosing paediatric pneumonia on chest X-rays. We enhance interpretability through Bayesian Gradient-weighted Class Activation Mapping (BayesGrad-CAM), which quantifies uncertainty in visual explanations and highlights the spatial locations that drive the model's decisions. Our ResNet-50 model, trained on a large paediatric chest X-ray dataset, achieves high classification accuracy (95.94%), AUC-ROC (98.91%), and Cohen's Kappa (0.913), accompanied by clinically meaningful visual explanations. Our findings demonstrate that high performance and interpretability are not only achievable but critical for clinical AI deployment.
Multimodal Large Language Models (LLMs) hold promise for biomedical reasoning, but current benchmarks fail to capture the complexity of real-world clinical workflows. Existing evaluations primarily assess unimodal, decontextualized question-answering, overlooking multi-agent decision-making environments such as Molecular Tumor Boards (MTBs). MTBs bring together diverse experts in oncology, where diagnostic and prognostic tasks require integrating heterogeneous data and evolving insights over time. Current benchmarks lack this longitudinal and multimodal complexity. We introduce MTBBench, an agentic benchmark simulating MTB-style decision-making through clinically challenging, multimodal, and longitudinal oncology questions. Ground truth annotations are validated by clinicians via a co-developed app, ensuring clinical relevance. We benchmark multiple open and closed-source LLMs and show that, even at scale, they lack reliability -- frequently hallucinating, struggling with reasoning from time-resolved data, and failing to reconcile conflicting evidence or different modalities. To address these limitations, MTBBench goes beyond benchmarking by providing an agentic framework with foun
MRI is preferred over CT in paediatric imaging because it avoids ionising radiation, but its use in spine deformity assessment is largely limited by the lack of automated, high-resolution 3D bony reconstruction, which continues to rely on CT. MRI-based 3D reconstruction remains impractical due to manual workflows and the scarcity of labelled full-spine datasets. This study introduces an AI framework that enables fully automated thoracolumbar spine (T1-L5) segmentation and 3D reconstruction from MRI alone. Historical low-dose CT scans from adolescent idiopathic scoliosis (AIS) patients were converted into MRI-like images using a GAN and combined with existing labelled thoracic MRI data to train a U-Net-based model. The resulting algorithm accurately generated continuous thoracolumbar 3D reconstructions, improved segmentation accuracy (88% Dice score), and reduced processing time from approximately 1 hour to under one minute, while preserving AIS-specific deformity features. This approach enables radiation-free 3D deformity assessment from MRI, supporting clinical evaluation, surgical planning, and navigation in paediatric spine care.
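For readers unfamiliar with the Dice score reported in the abstract above, the following is a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), computed on flattened binary masks. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are illustrative assumptions, not details taken from the study.

```python
def dice_score(pred_mask, true_mask):
    """Dice overlap between two equal-length binary masks (flattened voxels).

    Assumption: any truthy entry counts as foreground. Returning 1.0 when
    both masks are empty is a common convention, chosen here for illustration.
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same length")
    # Voxels marked foreground in both masks.
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred_mask if p) + sum(1 for t in true_mask if t)
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 0]
    reference = [1, 1, 1, 0, 0]
    print(dice_score(predicted, reference))
```

A 3D segmentation would be flattened to one such list per vertebra or per volume before scoring; how the study aggregates scores is not stated in the abstract.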
In the past year, there has been a growing trend in applying Large Language Models (LLMs) to the field of medicine, particularly with the advent of advanced language models such as ChatGPT developed by OpenAI. However, there is limited research on LLMs specifically addressing oncology-related queries. The primary aim of this research was to develop a specialized language model that demonstrates improved accuracy in providing advice related to oncology. We performed an extensive data collection of online question-answer interactions centered around oncology, sourced from reputable doctor-patient platforms. Following data cleaning and anonymization, a dataset comprising over 180K oncology-related conversations was established. The conversations were categorized and meticulously reviewed by field specialists and clinicians to ensure precision. Employing the LLaMA model and other selected open-source datasets, we conducted iterative fine-tuning to enhance the model's proficiency in basic medical conversation and specialized oncology knowledge. We observed a substantial enhancement in the model's understanding of genuine patient inquiries and its reliability in offering oncology-relate
Randomized discontinuation design (RDD) is an enrichment strategy commonly used to address limitations of traditional placebo-controlled trials, particularly the ethical concern of prolonged placebo exposure. RDD consists of two phases: an initial open-label phase in which all eligible patients receive the investigational medicinal product (IMP), followed by a double-blind phase in which responders are randomized to continue with the IMP or switch to placebo. This design tests whether the IMP provides benefit beyond the placebo effect. The estimand framework introduced in ICH E9(R1) strengthens the dialogue among clinical research stakeholders by clarifying trial objectives and aligning them with appropriate statistical analyses. However, its application in oncology trials using RDD remains unclear. This manuscript uses the phase III JAVELIN Gastric 100 trial and the phase II trial of sorafenib (BAY 43-9006) as case studies to propose an estimand framework tailored for oncology trials employing RDD in phase III and phase II settings, respectively. We highlight some similarities and differences between RDDs and traditional randomized controlled trials in the context of ICH E9(R1). T
Both medical care and observational studies in oncology require a thorough understanding of a patient's disease progression and treatment history, often elaborately documented in clinical notes. Despite their vital role, no current oncology information representation and annotation schema fully encapsulates the diversity of information recorded within these notes. Although large language models (LLMs) have recently exhibited impressive performance on various medical natural language processing tasks, due to the current lack of comprehensively annotated oncology datasets, an extensive evaluation of LLMs in extracting and reasoning with the complex rhetoric in oncology notes remains understudied. We developed a detailed schema for annotating textual oncology information, encompassing patient characteristics, tumor characteristics, tests, treatments, and temporality. Using a corpus of 40 de-identified breast and pancreatic cancer progress notes at University of California, San Francisco, we applied this schema to assess the zero-shot abilities of three recent LLMs (GPT-4, GPT-3.5-turbo, and FLAN-UL2) to extract detailed oncological history from two narrative sections of clinical progr
Personalized oncology aims to tailor treatment strategies to the unique molecular and clinical profiles of individual patients, moving beyond the traditional paradigm of treating the disease, not the patient. Achieving this vision requires the integration and interpretation of vast, heterogeneous biomedical data within a meaningful scientific framework. Knowledge graphs, structured according to biomedical ontologies, offer a powerful approach to contextualize and interconnect diverse datasets, enabling more precise and informed clinical decision-making. We present ECKO (Explainable Clinical Knowledge for Oncology), a comprehensive knowledge graph that integrates 33 biomedical ontologies and aggregates data from multiple studies to create a unified resource optimized for data-driven clinical applications in oncology. Designed to support personalized drug recommendations, ECKO facilitates the identification of optimal therapeutic options by linking patient-specific molecular data to relevant pharmacological knowledge. It provides transparent, interpretable explanations for drug recommendations, fostering greater trust and understanding among clinicians and researchers. This resource r