Electronic health record (EHR) notes are dense medical documents containing large amounts of information, often filled with complex medical jargon. Highlighting all details in EHRs helps reduce the likelihood of missing crucial information by drawing attention to key content. This study proposes the design of a Cardiology Interface Terminology (CIT) to accurately highlight all details in EHR notes of cardiology patients. We introduce an innovative Machine Learning (ML) technique for the design of CIT. The ML technique requires training data. Manual preparation of such training data is time-consuming and expensive. The process of the CIT design includes three phases. In the first two phases, we innovatively derive a training data CIT to be used by the third phase, ML technique. We start by designing an initial CIT, composed of several components: the cardiology-related sub-hierarchies of SNOMED, other SNOMED concepts mined from EHRs of build set, and necessary components of terms e.g., medical abbreviations and medications. Utilizing an iterative process, fine-grained phrases containing initial CIT concepts are extracted from build set as CIT concept candidates. The candidate concep
Biomedical text embeddings have primarily been developed using research literature from PubMed, yet clinical cardiology practice relies heavily on procedural knowledge and specialized terminology found in comprehensive textbooks rather than research abstracts. This research practice gap limits the effectiveness of existing embedding models for clinical applications incardiology. This study trained CardioEmbed, a domain-specialized embedding model based on Qwen3-Embedding-8B, using contrastive learning on a curated corpus of seven comprehensive cardiology textbooks totaling approximately 150,000 sentences after deduplication. The model employs InfoNCE loss with in-batch negatives and achieves 99.60% retrieval accuracy on cardiac-specific semantic retrieval tasks, a +15.94 percentage point improvement over MedTE, the current state-of-the-art medical embedding model. On MTEB medical benchmarks, the model obtained BIOSSES 0.77 Spearman and SciFact 0.61 NDCG@10, indicating competitive performance on related biomedical domains. Domain-specialized training on comprehensive clinical textbooks yields near-perfect cardiology retrieval (99.60% Acc@1), improving over MedTE by +15.94 percentage
Heart diseases remain a leading cause of morbidity and mortality worldwide, necessitating accurate and trustworthy differential diagnosis. However, existing artificial intelligence-based diagnostic methods are often limited by insufficient cardiology knowledge, inadequate support for complex reasoning, and poor interpretability. Here we present HeartAgent, a cardiology-specific agent system designed to support a reliable and explainable differential diagnosis. HeartAgent integrates customized tools and curated data resources and orchestrates multiple specialized sub-agents to perform complex reasoning while generating transparent reasoning trajectories and verifiable supporting references. Evaluated on the MIMIC dataset and a private electronic health records cohort, HeartAgent achieved over 36% and 20% improvements over established comparative methods, in top-3 diagnostic accuracy, respectively. Additionally, clinicians assisted by HeartAgent demonstrated gains of 26.9% in diagnostic accuracy and 22.7% in explanatory quality compared with unaided experts. These results demonstrate that HeartAgent provides reliable, explainable, and clinically actionable decision support for cardio
The rapidly increasing volume of electronic health record (EHR) data underscores a pressing need to unlock biomedical knowledge from unstructured clinical texts to support advancements in data-driven clinical systems, including patient diagnosis, disease progression monitoring, treatment effects assessment, prediction of future clinical events, etc. While contextualized language models have demonstrated impressive performance improvements for named entity recognition (NER) systems in English corpora, there remains a scarcity of research focused on clinical texts in low-resource languages. To bridge this gap, our study aims to develop multiple deep contextual embedding models to enhance clinical NER in the cardiology domain, as part of the BioASQ MultiCardioNER shared task. We explore the effectiveness of different monolingual and multilingual BERT-based models, trained on general domain text, for extracting disease and medication mentions from clinical case reports written in English, Spanish, and Italian. We achieved an F1-score of 77.88% on Spanish Diseases Recognition (SDR), 92.09% on Spanish Medications Recognition (SMR), 91.74% on English Medications Recognition (EMR), and 88.
Cardiovascular diseases are becoming increasingly prevalent in modern society, with a profound impact on global health and well-being. These Cardiovascular disorders are complex and multifactorial, influenced by genetic predispositions, lifestyle choices, and diverse socioeconomic and clinical factors. Information about these interrelated factors is dispersed across multiple types of textual data, including patient narratives, medical records, and scientific literature. Natural language processing (NLP) has emerged as a powerful approach for analysing such unstructured data, enabling healthcare professionals and researchers to gain deeper insights that may transform the diagnosis, treatment, and prevention of cardiac disorders. This review provides a comprehensive overview of NLP research in cardiology from 2014 to 2025. We systematically searched six literature databases for studies describing NLP applications across a range of cardiovascular diseases. After a rigorous screening process, we identified 265 relevant articles. Each study was analysed across multiple dimensions, including NLP paradigms, cardiology-related tasks, disease types, and data sources. Our findings reveal sub
Recently, significant efforts have been made to create Health Digital Twins (HDTs), digital twins for clinical applications. Heart modeling is one of the fastest-growing fields, which favors the effective application of HDTs. The clinical application of HDTs will be increasingly widespread in the future of healthcare services and has a huge potential to form part of the mainstream in medicine. However, it requires the development of both models and algorithms for the analysis of medical data, and advances in Artificial Intelligence (AI) based algorithms have already revolutionized image segmentation processes. Precise segmentation of lesions may contribute to an efficient diagnostics process and a more effective selection of targeted therapy. In this paper, a brief overview of recent achievements in HDT technologies in the field of cardiology, including interventional cardiology was conducted. HDTs were studied taking into account the application of Extended Reality (XR) and AI, as well as data security, technical risks, and ethics-related issues. Special emphasis was put on automatic segmentation issues. It appears that improvements in data processing will focus on automatic segme
This article investigates how translation memories (TM) can be created by translators or other language professionals in order to compile domain-specific parallel corpora , which can then be used in different scenarios, such as machine translation training and fine-tuning, TM leveraging, and/or large language model fine-tuning. The article introduces a semi-automatic TM preparation methodology leveraging primarily translation tools used by translators in favor of data quality and control by the translators. This semi-automatic methodology is then used to build a cardiology-based Turkish -> English corpus from bilingual abstracts of Turkish cardiology journals. The resulting corpus called TRENCARD Corpus has approximately 800,000 source words and 50,000 sentences. Using this methodology, translators can build their custom TMs in a reasonable time and use them in their bilingual data requiring tasks.
Domain-specific text embeddings are critical for clinical natural language processing, yet systematic comparisons across model architectures remain limited. This study evaluates ten transformer-based embedding models adapted for cardiology through Low-Rank Adaptation (LoRA) fine-tuning on 106,535 cardiology text pairs derived from authoritative medical textbooks. Results demonstrate that encoder-only architectures, particularly BioLinkBERT, achieve superior domain-specific performance (separation score: 0.510) compared to larger decoder-based models, while requiring significantly fewer computational resources. The findings challenge the assumption that larger language models necessarily produce better domain-specific embeddings and provide practical guidance for clinical NLP system development. All models, training code, and evaluation datasets are publicly available to support reproducible research in medical informatics.
We investigate the effectiveness of fine-tuning large language models (LLMs) on small medical datasets for text classification and named entity recognition tasks. Using a German cardiology report dataset and the i2b2 Smoking Challenge dataset, we demonstrate that fine-tuning small LLMs locally on limited training data can improve performance achieving comparable results to larger models. Our experiments show that fine-tuning improves performance on both tasks, with notable gains observed with as few as 200-300 training examples. Overall, the study highlights the potential of task-specific fine-tuning of LLMs for automating clinical workflows and efficiently extracting structured data from unstructured medical text.
The medical field is creating large amount of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient in solving complicated medical tasks or for creating insights using big data. Deep learning has emerged as a more accurate and effective technology in a wide range of medical problems such as diagnosis, prediction and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus, revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology that also apply in medicine in general, while proposing certain directions as the most viable for clinical use.
Spin-momentum locked (SML) topological surface state (TSS) provides exotic properties for spintronics applications. The spin-polarized current, which emerges owing to the SML, can be directly detected by performing spin potentiometric measurement. We observed spin-polarized current using a bulk insulating topological insulator (TI), Bi1.5Sb0.5Te1.7Se1.3, and Co as the ferromagnetic spin probe. The spin voltage was probed with varying the bias current, temperature, and gate voltage. Moreover, we observed non-local spin-polarized current, which is regarded as a distinguishing property of TIs. The spin-polarization ratio of the non-local current was larger than that of the local current. These findings could reveal a more accurate approach to determine spin-polarization ratio at the TSS.
This paper presents an in-depth analysis of the performance of seven different Large Language Models (LLMs) in solving a diverse set of math advanced calculus problems. The study aims to evaluate these models' accuracy, reliability, and problem-solving capabilities, including ChatGPT 4o, Gemini Advanced with 1.5 Pro, Copilot Pro, Claude 3.5 Sonnet, Meta AI, Mistral AI, and Perplexity. The assessment was conducted through a series of thirty-two test problems, encompassing a total of 320 points. The problems covered various topics, from vector calculations and geometric interpretations to integral evaluations and optimization tasks. The results highlight significant trends and patterns in the models' performance, revealing both their strengths and weaknesses - for instance, models like ChatGPT 4o and Mistral AI demonstrated consistent accuracy across various problem types, indicating their robustness and reliability in mathematical problem-solving, while models such as Gemini Advanced with 1.5 Pro and Meta AI exhibited specific weaknesses, particularly in complex problems involving integrals and optimization, suggesting areas for targeted improvements. The study also underscores the
For an electron, a spin-1/2 particle, the spin charge $\mathbf{s}$, a real pseudovector with constant length, could determine the spin polarization properties in quantum mechanics. Since spin density $ρ_{\mathbf{s}}$ could be expressed as the product of probability density $ρ$ and the spin charge $\mathbf{s}$, the spin continuity equation could be derived from fundamental principles of quantum mechanics and the definition of spin current arise naturally. Only in some specific conditions the conventional definition of spin current coincides with ours. The equilibrium spin currents would vanish automatically in two-dimensional electron gas with Rashba spin-orbit interaction by using our definition of spin current.
Use real word data to evaluate the performance of the electrocardiographic markers of GEH as features in a machine learning model with Standard ECG features and Risk Factors in Predicting Outcome of patients in a population referred to a tertiary cardiology hospital. Patients forwarded to specific evaluation in a cardiology specialized hospital performed an ECG and a risk factor anamnesis. A series of follow up attendances occurred in periods of 6 months, 12 months and 15 months to check for cardiovascular related events (mortality or new nonfatal cardiovascular events (Stroke, MI, PCI, CS), as identified during 1-year phone follow-ups. The first attendance ECG was measured by a specialist and processed in order to obtain the global electric heterogeneity (GEH) using the Kors Matriz. The ECG measurements, GEH parameters and risk factors were combined for training multiple instances of XGBoost decision trees models. Each instance were optmized for the AUCPR and the instance with higher AUC is chosen as representative to the model. The importance of each parameter for the winner tree model was compared to better understand the improvement from using GEH parameters. The GEH parameters
Electronic health records contain rich textual data which possess critical predictive information for machine-learning based diagnostic aids. However many traditional machine learning methods fail to simultaneously integrate both vector space data and text. We present a supervised method using Laplacian eigenmaps to augment existing machine-learning methods with low-dimensional representations of textual predictors which preserve the local similarities. The proposed implementation performs alternating optimization using gradient descent. For the evaluation we applied our method to over 2,000 patient records from a large single-center pediatric cardiology practice to predict if patients were diagnosed with cardiac disease. Our method was compared with latent semantic indexing, latent Dirichlet allocation, and local Fisher discriminant analysis. The results were assessed using AUC, MCC, specificity, and sensitivity. Results indicate supervised Laplacian eigenmaps was the highest performing method in our study, achieving 0.782 and 0.374 for AUC and MCC respectively. SLE showed an increase in 8.16% in AUC and 20.6% in MCC over the baseline which excluded textual data and a 2.69% and 5.
We consider inverse problems related to the velocity reconstruction in electrically conducting fluids from externally measured magnetic fields. The underlying theory is presented in the framework of the integral equation approach to homogeneous dynamos in finite domains, which can be cast into a linear inverse problem in case that the magnetic Reynolds number is not too large. Some mathematical problems of the inversion, including the uniqueness problem in the sphere and a paradigmatic isospectrality problem for mean-field dynamos, are touched upon. For practical purposes, the inversion is carried out with the help of Tikhonov regularization using a quadratic functional of the velocity field as penalty function. For the first time, we present results of an experiment in which the three-dimensional velocity field of a propellor driven flow in a liquid metal is reconstructed by a contactless inductive measuring technique.
Observable currents are spacetime local objects that induce physical observables when integrated on an auxiliary codimension one surface. Since the resulting observables are independent of local deformations of the integration surface, the currents themselves carry most of the information about the induced physical observables. I introduce observable currents in a multisymplectic framework for Lagrangian field theory over discrete spacetime. One family of examples is composed by Noether currents. A much larger family of examples is composed by currents, spacetime local objects, that encode the symplectic product between two arbitrary vectors tangent to the space of solutions. A weak version of observable currents, which in general are nonlocal, is also introduced. Weak observable currents can be used to separate points in the space of physically distinct solutions. It is shown that a large class of weak observable currents can be "improved" to become local. A Poisson bracket gives the space of observable currents the structure of a Lie algebra. Peierls bracket for bulk observables gives an algebra homomorphism mapping equivalence classes of bulk observables to weak observable curre
Software technology based on reuse is identified as a process of designing software for the reuse purpose. The software reuse is a process in which the existing software is used to build new software. A metric is a quantitative indicator of an attribute of an item or thing. Reusability is the likelihood for a segment of source code that can be used again to add new functionalities with slight or no modification. A lot of research has been projected using reusability in reducing code, domain, requirements, design etc., but very little work is reported using software reuse in medical domain. An attempt is made to bridge the gap in this direction, using the concepts of clustering and classifying the data based on the distance measures. In this paper cardiologic database is considered for study. The developed model will be useful for Doctors or Paramedics to find out the patients level in the cardiologic disease, deduce the medicines required in seconds and propose them to the patient. In order to measure the reusability K means clustering algorithm is used.
In this paper we consider the computation of approximate solutions for inverse problems in Hilbert spaces. In order to capture the special feature of solutions, non-smooth convex functions are introduced as penalty terms. By exploiting the Hilbert space structure of the underlying problems, we propose a fast iterative regularization method which reduces to the classical nonstationary iterated Tikhonov regularization when the penalty term is chosen to be the square of norm. Each iteration of the method consists of two steps: the first step involves only the operator from the problem while the second step involves only the penalty term. This splitting character has the advantage of making the computation efficient. In case the data is corrupted by noise, a stopping rule is proposed to terminate the method and the corresponding regularization property is established. Finally, we test the performance of the method by reporting various numerical simulations, including the image deblurring, the determination of source term in Poisson equation, and the de-autoconvolution problem.
In order to solve backward parabolic problems F. John [{\it Comm. Pure. Appl. Math.} (1960)] introduced the two constraints "$\|u(T)\|\le M$" and $\|u(0) - g \| \le δ$ where $u(t)$ satisfies the backward heat equation for $t\in(0,T)$ with the initial data $u(0).$ The {\it slow-evolution-from-the-continuation-boundary} (SECB) constraint has been introduced by A. Carasso in [{\it SIAM J. Numer. Anal.} (1994)] to attain continuous dependence on data for backward parabolic problems even at the continuation boundary $t=T$. The additional "SECB constraint" guarantees a significant improvement in stability up to $t=T.$ In this paper we prove that the same type of stability can be obtained by using only two constraints among the three. More precisely, we show that the a priori boundedness condition $\|u(T)\|\le M$ is redundant. This implies that the Carasso's SECB condition can be used to replace the a priori boundedness condition of F. John with an improved stability estimate. Also a new class of regularized solutions is introduced for backward parabolic problems with an SECB constraint. The new regularized solutions are optimally stable and we also provide a constructive scheme to comput