Identifying rare disease (RD) patients in electronic health records (EHRs) is difficult, as most of the over 10,000 RDs are not adequately captured by standard coding systems. To address this, we developed a semi-automated workflow to map RDs to SNOMED-CT and ICD-10 codes, enabling improved RD identification across EHR systems. The optimized workflow yielded 88.4% true RD codes in a subset of 1,715 manually curated diseases. Using this workflow and starting with 12,003 GARD IDs mapped to ORPHANET, we obtained 12,081 SNOMED-CT and 357 ICD-10 codes representing 6,342 RDs, organized into 30 ORPHANET linearization classes. We applied these codes to the National COVID Cohort Collaborative (N3C) dataset of over 21 million patients. Of these patients, 8.46 million were identified as COVID-19 positive, of whom 4.8 million were included in the analyses. Among those analyzed, 316,836 (6.55%) had a preexisting RD. Logistic regression, adjusted for age and BMI, revealed that most RD classes were significantly associated with increased odds of severe COVID-19 outcomes. Notably high odds of mortality were observed for rare cardiac (OR = 4.07) and otorhinolaryngologic diseases (OR = 4.00). Hospitalization risk was also elevated across all RD classes, with the highest odds seen in otorhinolaryngologic (OR = 4.31) and endocrine diseases (OR = 3.38). This approach enables scalable RD patient identification in EHRs and highlights the need for tailored healthcare strategies to improve outcomes in RD populations.
Large language models (LLMs) have been evaluated as tools to assist rare disease diagnosis, yet evidence on their accuracy remains fragmented. We conducted a systematic review and meta-analysis to synthesize the available evidence on the diagnostic performance of LLMs, identify sources of heterogeneity, and evaluate the current evidence base for clinical translation. We searched PubMed, Embase, Web of Science, Cochrane Library, arXiv, and medRxiv (January 2020-February 2026). Full-text articles and preprints were considered for inclusion. Eligible studies applied LLM-based systems to generate differential diagnoses for rare diseases and provided Recall@1 (R@1; proportion with the correct diagnosis ranked first). We pooled R@1 using Freeman-Tukey double arcsine transformation with DerSimonian-Laird random-effects models. Pre-specified subgroup analyses examined LLM knowledge augmentation strategy and input modality. Because both retained high residual heterogeneity, we conducted a post-hoc exploratory analysis of evaluation benchmark disease composition, mapping diseases from major benchmarks to Orphanet prevalence classifications. Risk of bias was assessed using a modified QUADAS-3 instrument. We identified 902 records, of which 564 were screened and 15 studies were eligible. These 15 studies contributed 19 system-dataset entries to the meta-analysis (total N=39,529 cases). The pooled R@1 was 43·3% (95% CI 35·1-51·6; I²=99·6%). Augmented LLM systems (agent-based reasoning, retrieval, or fine-tuning; k=8) achieved R@1 of 52·5% (42·0-62·9) versus 35·4% (30·6-40·4) for standalone LLMs (k=11; p=0·004). Post-hoc exploratory analysis indicated that evaluation benchmark disease composition was associated with differences in diagnostic performance: R@1 was 21·7% (18·2-25·5) on the Phenopacket Store dataset (k=2), in which 52·8% of diseases were ultra-rare, versus 52·0% (40·7-63·2) on RareBench (k=6), in which 29·3% were ultra-rare (p<0·001).
All 19 system-dataset entries were assessed to be at high risk of bias, most commonly due to potential data leakage and limited reproducibility. No study provided prospective clinical validation. Diagnostic performance of LLM-based systems for rare diseases varied substantially across evaluation benchmarks. Post-hoc exploratory analysis indicated that performance was associated with benchmark disease composition. Performance was higher in benchmarks containing fewer ultra-rare diseases and in systems incorporating external knowledge at inference time. However, all included studies were at high risk of bias, and none reported prospective clinical validation. These findings highlight the need for prevalence-stratified evaluation benchmarks and independent prospective studies before clinical deployment. This work was supported in part by the National Institutes of Health Common Fund, grant 15-HG-0130 from the National Human Genome Research Institute, U01NS134349 from the National Institute of Neurological Disorders and Stroke, R00LM014429 from the National Library of Medicine, and the Potocsnak Center for Undiagnosed and Rare Disorders.
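The pooling approach named in the abstract above (Freeman-Tukey double arcsine transformation with a DerSimonian-Laird random-effects model) can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the example counts are hypothetical, and the back-transformation uses the simple sin²(t/2) approximation rather than an exact inverse.

```python
import math

def freeman_tukey(x, n):
    """Double arcsine transform of x successes out of n, with its variance."""
    t = math.asin(math.sqrt(x / (n + 1))) + math.asin(math.sqrt((x + 1) / (n + 1)))
    return t, 1.0 / (n + 0.5)

def back_transform(t):
    """Approximate inverse of the double arcsine transform (assumption: sin^2(t/2))."""
    t = min(max(t, 0.0), math.pi)  # keep within the valid range
    return math.sin(t / 2) ** 2

def pool_proportions(studies):
    """DerSimonian-Laird random-effects pooling of (successes, total) pairs."""
    ts, vs = zip(*(freeman_tukey(x, n) for x, n in studies))
    k = len(ts)
    w = [1.0 / v for v in vs]                      # fixed-effect weights
    sw = sum(w)
    t_fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sw
    q = sum(wi * (ti - t_fixed) ** 2 for wi, ti in zip(w, ts))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0        # between-study variance
    w_re = [1.0 / (v + tau2) for v in vs]          # random-effects weights
    t_re = sum(wi * ti for wi, ti in zip(w_re, ts)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return {
        "pooled": back_transform(t_re),
        "ci_low": back_transform(t_re - 1.959964 * se),
        "ci_high": back_transform(t_re + 1.959964 * se),
        "tau2": tau2,
        "i2": i2,
    }

# Hypothetical (correct-at-rank-1, total cases) counts, for illustration only.
example = [(52, 100), (21, 100), (70, 200)]
print(pool_proportions(example))
```

Each entry contributes the number of cases with the correct diagnosis ranked first and the total number of cases; a high I² in the output signals, as in the abstract, that most of the observed variation reflects between-study differences rather than sampling error.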
According to the OMIM and Orphanet databases, Schaaf-Yang syndrome (SYS) (OMIM: 615547, ORPHA: 398069) is a rare genetic disorder that shares certain clinical features with Prader-Willi syndrome (PWS), including hypotonia, developmental delay, and early-onset obesity. However, SYS often exhibits a more complex and variable phenotype. Missense variants in MAGEL2 have been reported only rarely, and their phenotypic spectrum appears milder and more variable than that of truncating mutations. Data on early-onset obesity as a dominant feature in such patients are limited. In this case report, we describe a child with SYS carrying the novel missense variant MAGEL2(NM_019066.5):c.1265C>T (p.Pro422Leu), presenting with severe early-onset obesity and a comparatively mild neurodevelopmental phenotype. We present the case of a boy with neonatal hypotonia, diagnosed with SYS at age 9 years, with follow-up to age 11 years. The boy was born at 34+3 weeks of gestation with hypotonia, feeding difficulties, and a persistent ductus arteriosus that required surgical ligation in early infancy. In the following years, he developed severe early-onset obesity, already evident by age 2 despite multidisciplinary care. Genetic testing performed at age 9 years identified a novel missense variant, NM_019066.5:c.1265C>T, in the MAGEL2 gene, which was not inherited from his mother, thereby confirming the diagnosis of SYS. At the time of the most recent evaluation, at age 11 years, he remained under long-term follow-up. Clinical management over this period included endocrine therapy, cardiac surgery, physical rehabilitation, and dietary interventions. Despite the complexity of his condition, long-term stabilization of his BMI percentile was achieved with consistent non-pharmacological interventions. This case highlights the importance of early multidisciplinary investigation and intervention in SYS, particularly when obesity is the dominant feature.
Effective long-term weight stabilization is possible through structured lifestyle management.
We investigated whether markers, genes, or terms of the Human Phenotype Ontology associated with genetic or rare diseases (GARDs) that affect airway or lung function are associated with lung cancer. Genes of interest were extracted from GARD (Genetic and Rare Diseases Information Center), OMIM (Online Mendelian Inheritance in Man®), ORPHANET, and the Monarch Initiative. Individual SNP, gene-level, and gene-set analyses were performed for 52,207 SNPs, 1,677 genes, and 620 terms of the Human Phenotype Ontology, respectively. The analysis included 14,068 lung cancer cases and 12,390 cancer-free control subjects of European descent from the International Lung Cancer Consortium (ILCCO). The marker rs56113850 (OR = 0.893, 95% CI: 0.862-0.924) was associated with lung cancer (p = 1.2 × 10⁻¹⁰). This marker is located in CYP2A6 as well as in an enhancer region of LTBP4, which is associated with cutis laxa. A suggestive association was observed for two markers associated with the DMD gene, which is linked to Duchenne muscular dystrophy. The gene sets "Abnormal circulating adrenocorticotropin concentration" and "Central nervous system neoplasm" were found to be significantly enriched with GARD genes, and can therefore be considered to be associated with lung cancer. Genes associated with genetic and rare lung diseases do not generally appear to carry risk factors for lung cancer. However, genes associated with the hypothalamic-pituitary-adrenal axis show some, but rather weak or complex, associations with lung cancer. Tests at the gene level provide extremely inhomogeneous results, even when applied to the same data. The online version contains supplementary material available at 10.1186/s12885-026-15934-2.
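As a rough consistency check on the reported association for rs56113850, a Wald z-statistic and two-sided p-value can be approximated from the odds ratio and its 95% confidence interval alone. The sketch below assumes the interval is a symmetric Wald interval on the log-odds scale; the function name is hypothetical, and because the published OR and CI are rounded, the result should only be expected to agree with the reported p-value in order of magnitude.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z and two-sided p from an OR and its 95% Wald CI."""
    log_or = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

z, p = p_from_or_ci(0.893, 0.862, 0.924)
print(f"z = {z:.2f}, p = {p:.1e}")
```

With the values quoted in the abstract, the approximation lands on the order of 10⁻¹⁰, in line with the reported p-value.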
Copy number variations (CNVs) are large structural alterations of the genome that can contribute significantly to the genetic basis of neurodevelopmental and neuropsychiatric conditions, including schizophrenia, autism spectrum disorder, and intellectual disability. Although CNVs are genomically diverse, many result in overlapping clinical features and molecular changes. We present a curated, machine-readable dataset, CNVPathwayAtlas, that integrates 38 pathogenic CNVs with their genomic coordinates, affected genes, molecular pathways, associated syndromes, and phenotypes. Each CNV is linked to a curated molecular pathway, providing mechanistic insight into affected biological functions. The dataset is integrated with external resources including WikiPathways, Orphanet, HGNC, and the Human Phenotype Ontology, and is designed for compatibility with bioinformatics workflows. It provides a structured foundation for analyzing the molecular effects of CNVs and facilitates exploration of shared disorder mechanisms, diagnosis, identification of therapeutic targets, and drug discovery in neurodevelopmental and neuropsychiatric disorders.
Living donor liver transplantation (LDLT) has become an important therapeutic option for children with selected inherited metabolic and genetic cholestatic liver diseases (IM-GCLDs). However, evidence on disease-specific outcomes across different diagnostic categories remains limited. We therefore conducted a single-center retrospective study, using contemporaneous non-IM-GCLD pediatric LDLT recipients as a comparator, to better contextualize transplant-related outcomes and disease-specific benefits. Among 21 children with IM-GCLDs, the median follow-up was 21 months; two patients died (one perioperatively from disseminated intravascular coagulation and one at 21 months from pneumonia-related multiorgan failure), and all others are alive with functioning grafts. Disease-specific manifestations, including neuropsychiatric symptoms, portal hypertension, metabolic crises, cholestasis, hyperbilirubinemia, and hyperammonemia, improved or resolved in almost all survivors. At 6 months after LDLT, in children <10 years, mean weight- and height-for-age Z-scores increased from -0.48 to 0.43 and from -0.76 to -0.01, respectively; in children ≥10 years, mean height Z-scores increased from -1.49 to -0.53 while BMI Z-scores showed no significant change. Overall survival did not differ significantly between IM-GCLDs and non-IM-GCLD indications. Living donor liver transplantation in children with IM-GCLDs not only improves survival but also confers disease-specific benefits, including recovery of neurologic function, metabolic stabilization, relief of portal hypertension and cholestasis, and catch-up growth. These findings support LDLT as an important therapeutic option for IM-GCLDs, while diagnosis-tailored perioperative assessment and long-term management remain essential given the phenotypic heterogeneity.
Huntington disease (HD) is an inherited neurodegenerative disorder that impairs motor, cognitive, and psychiatric function. Offspring of individuals with HD may experience early caregiving responsibilities, potentially disrupting their educational outcomes. We evaluated the associations of parental age at HD symptom onset, genetic-expansion, sociodemographic, and regional factors with offspring educational attainment in adulthood. We estimated odds ratios using logistic regression to evaluate associations between higher educational attainment in offspring and parental age at symptom onset, genetic-expansion, race, and region among adults (≥18 years) in the Enroll-HD study. To assess the relative importance of exposures in predicting educational outcomes, we fit a random forest model and ranked the exposures by mean decrease in accuracy. In our exploratory analysis, participants whose parents had an earlier age at HD symptom onset had lower odds of attaining higher education. We also identified a nonlinear, inverted-U association between genetic-expansion and the probability of higher educational attainment, a pattern that has been observed in prior studies of neurocognitive function in children. Marked differences were also observed by race and region: Black, Hispanic/Latino, and Native American participants had lower odds of higher education compared with White participants, and those residing outside Northern America had lower odds of higher educational attainment. Earlier parental HD onset was associated with lower educational attainment in offspring, and disparities were observed across genetic-expansion, sociodemographic, and regional groups. Our exploratory findings may inform future studies aimed at better understanding educational inequities among families affected by HD and related neurodegenerative disorders.
The online version contains supplementary material available at 10.1186/s13023-026-04336-z.
Costello syndrome (CS) is a rare genetic disorder within the spectrum of RASopathies, caused by activating mutations in the HRAS gene, leading to constitutive dysregulation of the RAS/MAPK signalling pathway. Among its multisystemic manifestations, a distinctive musculoskeletal involvement is frequently observed, with reduced muscle force, pain of musculoskeletal origin, and muscular hypotrophy. Moreover, abnormal histological findings, with variability in fiber size, atrophy, and a predominance of type 2 fibers, have been detected in anecdotal muscle biopsies of affected individuals. We conducted a monocentric study on 20 individuals (13 females, 7 males; median age 19 years) with a molecularly confirmed CS diagnosis, recruited between December 2019 and December 2022. Muscle ultrasonography (US) was performed on lumbar paravertebral, quadriceps (vastus medialis, vastus lateralis, rectus femoris), and gastrocnemius muscles to study muscle architecture and detect fibroadipose infiltration (FAI), a key determinant of muscle quality and functional performance. FAI was graded according to the Heckmatt scale (I-IV). Nutritional and metabolic assessments were conducted, including macronutrient intake (3-day diet recall) and resting energy expenditure (REE). To investigate the pathogenic contribution of HRAS dysregulation to skeletal muscle development, preclinical studies were performed in engineered mouse myoblasts expressing HRAS p.Gly12Ser and p.Gly13Cys variants. FAI was detected in 100% of participants in at least one muscle, most commonly in lumbar paravertebral muscles (85%) and vastus lateralis (70-75%); no grade IV involution cases occurred. FAI severity showed no association with age, physical activity, nutritional/biochemical parameters, or REE, but displayed region-specific correlations with skeletal anomalies (tight Achilles tendon and coxa valga).
In vitro, myoblasts expressing HRAS mutants exhibited increased proliferation, impaired myogenic differentiation/fusion, and lipid droplet accumulation with elevated cholesteryl esters compared with controls. US shows potential as a non-invasive tool for assessing and monitoring muscular health in CS and related RASopathies, though further data are needed to support its use. The pattern supports HRAS-mediated metabolic dysfunction as a primary driver of FAI, warranting multicenter longitudinal studies with standardized quantitative metrics integrated with functional testing, in view of future therapeutic trials.
Niemann-Pick disease type B (NPD-B) is a rare lysosomal storage disorder characterized by residual activity of acid sphingomyelinase (ASM). While functional inhibitors of ASM (FIASMAs) are widely prescribed as psychotropic medications, they may pose a particular risk to patients with NPD-B by further reducing the already impaired enzymatic function. Here, we report the case of a 20-year-old male with genetically confirmed NPD-B who experienced rapid clinical deterioration following the administration of zuclopenthixol, a drug not previously associated with FIASMA activity. Within 48 hours of treatment initiation, the patient developed profound lethargy and markedly elevated creatine kinase (CK) levels of up to 22,000 U/L, consistent with rhabdomyolysis. Symptoms resolved quickly after discontinuation of zuclopenthixol. In vitro experiments using a radioactive [14C]-sphingomyelin assay in Jurkat cells demonstrated that zuclopenthixol dose-dependently inhibited ASM activity by up to 71.5%. Zuclopenthixol had not previously been recognized as a FIASMA and might therefore have been considered a rational choice for treating patients with NPD-B. Our findings challenge this assumption by identifying zuclopenthixol as a potent inhibitor of ASM activity. This novel insight is of high clinical relevance, given the frequent use of antipsychotics in the management of neuropsychiatric symptoms in lysosomal storage disorders. We propose that zuclopenthixol and other potential FIASMAs be carefully re-evaluated for use in this vulnerable patient population.