Electronic health record (EHR) notes are dense medical documents containing large amounts of information, often filled with complex medical jargon. Highlighting all details in EHRs helps reduce the likelihood of missing crucial information by drawing attention to key content. This study proposes the design of a Cardiology Interface Terminology (CIT) to accurately highlight all details in EHR notes of cardiology patients. We introduce an innovative Machine Learning (ML) technique for the design of CIT. The ML technique requires training data. Manual preparation of such training data is time-consuming and expensive. The process of the CIT design includes three phases. In the first two phases, we innovatively derive a training data CIT to be used by the third phase, ML technique. We start by designing an initial CIT, composed of several components: the cardiology-related sub-hierarchies of SNOMED, other SNOMED concepts mined from EHRs of build set, and necessary components of terms e.g., medical abbreviations and medications. Utilizing an iterative process, fine-grained phrases containing initial CIT concepts are extracted from build set as CIT concept candidates. The candidate concep
Biomedical text embeddings have primarily been developed using research literature from PubMed, yet clinical cardiology practice relies heavily on procedural knowledge and specialized terminology found in comprehensive textbooks rather than research abstracts. This research practice gap limits the effectiveness of existing embedding models for clinical applications incardiology. This study trained CardioEmbed, a domain-specialized embedding model based on Qwen3-Embedding-8B, using contrastive learning on a curated corpus of seven comprehensive cardiology textbooks totaling approximately 150,000 sentences after deduplication. The model employs InfoNCE loss with in-batch negatives and achieves 99.60% retrieval accuracy on cardiac-specific semantic retrieval tasks, a +15.94 percentage point improvement over MedTE, the current state-of-the-art medical embedding model. On MTEB medical benchmarks, the model obtained BIOSSES 0.77 Spearman and SciFact 0.61 NDCG@10, indicating competitive performance on related biomedical domains. Domain-specialized training on comprehensive clinical textbooks yields near-perfect cardiology retrieval (99.60% Acc@1), improving over MedTE by +15.94 percentage
Heart diseases remain a leading cause of morbidity and mortality worldwide, necessitating accurate and trustworthy differential diagnosis. However, existing artificial intelligence-based diagnostic methods are often limited by insufficient cardiology knowledge, inadequate support for complex reasoning, and poor interpretability. Here we present HeartAgent, a cardiology-specific agent system designed to support a reliable and explainable differential diagnosis. HeartAgent integrates customized tools and curated data resources and orchestrates multiple specialized sub-agents to perform complex reasoning while generating transparent reasoning trajectories and verifiable supporting references. Evaluated on the MIMIC dataset and a private electronic health records cohort, HeartAgent achieved over 36% and 20% improvements over established comparative methods, in top-3 diagnostic accuracy, respectively. Additionally, clinicians assisted by HeartAgent demonstrated gains of 26.9% in diagnostic accuracy and 22.7% in explanatory quality compared with unaided experts. These results demonstrate that HeartAgent provides reliable, explainable, and clinically actionable decision support for cardio
Domain-specific text embeddings are critical for clinical natural language processing, yet systematic comparisons across model architectures remain limited. This study evaluates ten transformer-based embedding models adapted for cardiology through Low-Rank Adaptation (LoRA) fine-tuning on 106,535 cardiology text pairs derived from authoritative medical textbooks. Results demonstrate that encoder-only architectures, particularly BioLinkBERT, achieve superior domain-specific performance (separation score: 0.510) compared to larger decoder-based models, while requiring significantly fewer computational resources. The findings challenge the assumption that larger language models necessarily produce better domain-specific embeddings and provide practical guidance for clinical NLP system development. All models, training code, and evaluation datasets are publicly available to support reproducible research in medical informatics.
Cardiovascular diseases are becoming increasingly prevalent in modern society, with a profound impact on global health and well-being. These Cardiovascular disorders are complex and multifactorial, influenced by genetic predispositions, lifestyle choices, and diverse socioeconomic and clinical factors. Information about these interrelated factors is dispersed across multiple types of textual data, including patient narratives, medical records, and scientific literature. Natural language processing (NLP) has emerged as a powerful approach for analysing such unstructured data, enabling healthcare professionals and researchers to gain deeper insights that may transform the diagnosis, treatment, and prevention of cardiac disorders. This review provides a comprehensive overview of NLP research in cardiology from 2014 to 2025. We systematically searched six literature databases for studies describing NLP applications across a range of cardiovascular diseases. After a rigorous screening process, we identified 265 relevant articles. Each study was analysed across multiple dimensions, including NLP paradigms, cardiology-related tasks, disease types, and data sources. Our findings reveal sub
This article investigates how translation memories (TM) can be created by translators or other language professionals in order to compile domain-specific parallel corpora , which can then be used in different scenarios, such as machine translation training and fine-tuning, TM leveraging, and/or large language model fine-tuning. The article introduces a semi-automatic TM preparation methodology leveraging primarily translation tools used by translators in favor of data quality and control by the translators. This semi-automatic methodology is then used to build a cardiology-based Turkish -> English corpus from bilingual abstracts of Turkish cardiology journals. The resulting corpus called TRENCARD Corpus has approximately 800,000 source words and 50,000 sentences. Using this methodology, translators can build their custom TMs in a reasonable time and use them in their bilingual data requiring tasks.
The rapidly increasing volume of electronic health record (EHR) data underscores a pressing need to unlock biomedical knowledge from unstructured clinical texts to support advancements in data-driven clinical systems, including patient diagnosis, disease progression monitoring, treatment effects assessment, prediction of future clinical events, etc. While contextualized language models have demonstrated impressive performance improvements for named entity recognition (NER) systems in English corpora, there remains a scarcity of research focused on clinical texts in low-resource languages. To bridge this gap, our study aims to develop multiple deep contextual embedding models to enhance clinical NER in the cardiology domain, as part of the BioASQ MultiCardioNER shared task. We explore the effectiveness of different monolingual and multilingual BERT-based models, trained on general domain text, for extracting disease and medication mentions from clinical case reports written in English, Spanish, and Italian. We achieved an F1-score of 77.88% on Spanish Diseases Recognition (SDR), 92.09% on Spanish Medications Recognition (SMR), 91.74% on English Medications Recognition (EMR), and 88.
In modern collider experiments, the quest to explore fundamental interactions between elementary particles has reached unparalleled levels of precision. Signatures from particle physics detectors are low-level objects (such as energy depositions or tracks) encoding the physics of collisions (the final state particles of hard scattering interactions). The complete simulation of them in a detector is a computational and storage-intensive task. To address this computational bottleneck in particle physics, alternative approaches have been developed, introducing additional assumptions and trade off accuracy for speed.The field has seen a surge in interest in surrogate modeling the detector simulation, fueled by the advancements in deep generative models. These models aim to generate responses that are statistically identical to the observed data. In this paper, we conduct a comprehensive and exhaustive taxonomic review of the existing literature on the simulation of detector signatures from both methodological and application-wise perspectives. Initially, we formulate the problem of detector signature simulation and discuss its different variations that can be unified. Next, we classify
Recently, significant efforts have been made to create Health Digital Twins (HDTs), digital twins for clinical applications. Heart modeling is one of the fastest-growing fields, which favors the effective application of HDTs. The clinical application of HDTs will be increasingly widespread in the future of healthcare services and has a huge potential to form part of the mainstream in medicine. However, it requires the development of both models and algorithms for the analysis of medical data, and advances in Artificial Intelligence (AI) based algorithms have already revolutionized image segmentation processes. Precise segmentation of lesions may contribute to an efficient diagnostics process and a more effective selection of targeted therapy. In this paper, a brief overview of recent achievements in HDT technologies in the field of cardiology, including interventional cardiology was conducted. HDTs were studied taking into account the application of Extended Reality (XR) and AI, as well as data security, technical risks, and ethics-related issues. Special emphasis was put on automatic segmentation issues. It appears that improvements in data processing will focus on automatic segme
The article aims to analyze the performance of ChatGPT, a large language model developed by OpenAI, in the context of cardiology and vascular pathologies. The study evaluated the accuracy of ChatGPT in answering challenging multiple-choice questions (QCM) using a dataset of 190 questions from the Siamois-QCM platform. The goal was to assess ChatGPT potential as a valuable tool in medical education compared to two well-ranked students of medicine. The results showed that ChatGPT outperformed the students, scoring 175 out of 190 correct answers with a percentage of 92.10\%, while the two students achieved scores of 163 and 159 with percentages of 85.78\% and 82.63\%, respectively. These results showcase how ChatGPT has the potential to be highly effective in the fields of cardiology and vascular pathologies by providing accurate answers to relevant questions.
AI coding agents increasingly accept assigned software tasks, modify repositories under bounded authority, and return work packages for review. Prior work proposed the software delegation contract, covering the task, authority, returned work package, and acceptance context, as the unit of analysis for delegated coding work, but did not measure its effects. This paper reports a controlled pilot study of explicit delegation contracts for coding agents. We built a dependency-free TypeScript API task environment with seeded defects and documentation gaps, authored ten tasks across five families, and ran 64 agent executions across two model tiers under three conditions: a realistic issue-style prompt, an explicit delegation contract, and a contract with a required evidence bundle. Each run was scored with hidden acceptance tests, mutation checks, and scope analysis, then reviewed by three independent condition-blinded model-based reviewers using a fixed rubric, for 192 reviews. Explicit contracts did not improve objective task outcomes: all 64 runs passed hidden acceptance checks, with zero scope violations. They did improve reviewability. Evidence sufficiency improved in 22 of 30 paire
Possible topological nature of Kondo and mixed valence insulators has been a recent topic of interest in condensed matter physics. Attention has focused on SmB6, which has long been known to exhibit low temperature transport anomaly, whose origin is of independent interest. We argue that it is possible to resolve the topological nature of surface states by uniquely accessing the surface electronic structure of the low temperature anomalous transport regime through combining state-of-the-art laser- and synchrotron-based angle-resolved photoemission spectroscopy (ARPES) with or without spin resolution. A combination of low temperature and ultra-high resolution (laser) which is lacking in previous ARPES studies of this compound is the key to resolve the possible existence of topological surface state in SmB6. Here we outline an experimental algorithm to systematically explore the topological versus trivial or mixed (topological and trivial surface state admixture as in the first 3D TI Bi$_{1-x}$Sb$_x$) nature of the surface states in Kondo and mixed valence insulators. We conclude based on this methodology that the observed topology of the surface Fermi surface in our low temperature
Emergent and unscheduled cardiology admissions from cardiac catheterization laboratory add complexity to the management of Cardiology and in-patient department. In this article, we sought to study the behavior of cardiology admissions from Catheterization laboratory using time series models. Our research involves retrospective cardiology admission data from March 1, 2012, to November 3, 2016, retrieved from a hospital in Iowa. Autoregressive integrated moving average (ARIMA), Holts method, mean method, naïve method, seasonal naïve, exponential smoothing, and drift method were implemented to forecast weekly cardiology admissions from Catheterization laboratory. ARIMA (2,0,2) (1,1,1) was selected as the best fit model with the minimum sum of error, Akaike information criterion and Schwartz Bayesian criterion. The model failed to reject the null hypothesis of stationarity, it lacked the evidence of independence, and rejected the null hypothesis of normality. The implication of this study will not only improve catheterization laboratory staff schedule, advocate efficient use of imaging equipment and inpatient telemetry beds but also equip management to proactively tackle inpatient over
Natural products, as metabolites from microorganisms, animals, or plants, exhibit diverse biological activities, making them crucial for drug discovery. Nowadays, existing deep learning methods for natural products research primarily rely on supervised learning approaches designed for specific downstream tasks. However, such one-model-for-a-task paradigm often lacks generalizability and leaves significant room for performance improvement. Additionally, existing molecular characterization methods are not well-suited for the unique tasks associated with natural products. To address these limitations, we have pre-trained a foundation model for natural products based on their unique properties. Our approach employs a novel pretraining strategy that is especially tailored to natural products. By incorporating contrastive learning and masked graph learning objectives, we emphasize evolutional information from molecular scaffolds while capturing side-chain information. Our framework achieves state-of-the-art (SOTA) results in various downstream tasks related to natural product mining and drug discovery. We first compare taxonomy classification with synthesized molecule-focused baselines t
The Great Divide in metaphysical debates about laws of nature is between Humeans, who think that laws merely describe the distribution of matter, and non-Humeans, who think that laws govern it. The metaphysics can place demands on the proper formulations of physical theories. It is sometimes assumed that the governing view requires a fundamental / intrinsic direction of time: to govern, laws must be dynamical, producing later states of the world from earlier ones, in accord with the fundamental direction of time in the universe. In this paper, we propose a minimal primitivism about laws of nature (MinP) according to which there is no such requirement. On our view, laws govern by constraining the physical possibilities. Our view captures the essence of the governing view without taking on extraneous commitments about the direction of time or dynamic production. Moreover, as a version of primitivism, our view requires no reduction / analysis of laws in terms of universals, powers, or dispositions. Our view accommodates several potential candidates for fundamental laws, including the principle of least action, the Past Hypothesis, the Einstein equation of general relativity, and even
Upon mechanical loading, granular materials yield and undergo plastic deformation. The nature of plastic deformation is essential for the development of the macroscopic constitutive models and the understanding of shear band formation. However, we still do not fully understand the microscopic nature of plastic deformation in disordered granular materials. Here we used synchrotron X-ray tomography technique to track the structural evolutions of three-dimensional granular materials under shear. We establish that highly distorted coplanar tetrahedra are the structural defects responsible for microscopic plasticity in disordered granular packings. The elementary plastic events occur through flip events which correspond to a neighbor switching process among these coplanar tetrahedra (or equivalently as the rotation motion of 4-ring disclinations). These events are discrete in space and possess specific orientations with the principal stress direction.
We briefly review the various contexts within which one might address the issue of ``why'' the dimensionless constants of Nature have the particular values that they are observed to have. Both the general historical trend, in physics, of replacing a-priori-given, absolute structures by dynamical entities, and anthropic considerations, suggest that coupling ``constants'' have a dynamical nature. This hints at the existence of observable violations of the Equivalence Principle at some level, and motivates the need for improved tests of the Equivalence Principle.
The medical field is creating large amount of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient in solving complicated medical tasks or for creating insights using big data. Deep learning has emerged as a more accurate and effective technology in a wide range of medical problems such as diagnosis, prediction and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus, revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology that also apply in medicine in general, while proposing certain directions as the most viable for clinical use.
Planets form and obtain their compositions from the leftover material present in protoplanetary disks of dust and gas surrounding young stars. The chemical make-up of a disk influences every aspect of planetary composition including their overall chemical properties, volatile content, atmospheric composition, and potential for habitability. This Review discusses our knowledge of the chemical and isotopic composition of Solar System materials and how this information can be used to place constraints on the formation pathways of terrestrial planets. We conclude that planetesimal formation by the streaming instability followed by rapid accretion of drifting pebbles within the protoplanetary disk lifetime reproduces most of the chemical and isotopic observables in Solar System. This finding has important implications for planetary habitability beyond the Solar System because in pebble accretion, volatiles important for life are accreted during the main growth phase of rocky planets as opposed to the late-stage. Finally, we explore how bulk chemical inventories and masses of planetary bodies control the composition of their primordial atmospheres and their potential to develop habitable
Increasing demands on medical imaging departments are taking a toll on the radiologist's ability to deliver timely and accurate reports. Recent technological advances in artificial intelligence have demonstrated great potential for automatic radiology report generation (ARRG), sparking an explosion of research. This survey paper conducts a methodological review of contemporary ARRG approaches by way of (i) assessing datasets based on characteristics, such as availability, size, and adoption rate, (ii) examining deep learning training methods, such as contrastive learning and reinforcement learning, (iii) exploring state-of-the-art model architectures, including variations of CNN and transformer models, (iv) outlining techniques integrating clinical knowledge through multimodal inputs and knowledge graphs, and (v) scrutinising current model evaluation techniques, including commonly applied NLP metrics and qualitative clinical reviews. Furthermore, the quantitative results of the reviewed models are analysed, where the top performing models are examined to seek further insights. Finally, potential new directions are highlighted, with the adoption of additional datasets from other rad