Purpose: We investigated the use of privacy-preserving, locally deployed, open-source Large Language Models (LLMs) to extract diagnostic information from free-text cardiovascular magnetic resonance (CMR) reports. Materials and Methods: We evaluated nine open-source LLMs on their ability to identify diagnoses and classify patients into cardiac diagnostic categories based on descriptive findings in 109 clinical CMR reports. Performance was quantified using standard classification metrics: accuracy, precision, recall, and F1 score. We also used confusion matrices to examine patterns of misclassification across models. Results: Most open-source LLMs performed very well in classifying reports into diagnostic categories. Google's Gemma2 model achieved the highest average F1 score of 0.98, followed by Qwen2.5:32B and DeepseekR1-32B with F1 scores of 0.96 and 0.95, respectively. With the exception of Mistral and DeepseekR1-7B, all other evaluated models attained average scores above 0.93. The top four LLMs outperformed our board-certified cardiologist (F1 score of 0.94) across all evaluation metrics in analyzing CMR reports.
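The average F1 score reported above can be illustrated with a minimal sketch that derives per-class F1 from a confusion matrix and then takes the unweighted (macro) mean. The function names and the toy matrix are hypothetical and not taken from the study, and macro averaging is an assumption, since the abstract does not state which averaging scheme was used.

```python
from typing import List


def per_class_f1(confusion: List[List[int]]) -> List[float]:
    """Per-class F1 from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n = len(confusion)
    scores = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true otherwise
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(confusion: List[List[int]]) -> float:
    """Unweighted mean of the per-class F1 scores (assumed averaging scheme)."""
    scores = per_class_f1(confusion)
    return sum(scores) / len(scores)


# Toy two-class example (illustrative numbers only, not study data).
toy = [[8, 2],
       [1, 9]]
print(per_class_f1(toy), macro_f1(toy))
```

Macro averaging weights every diagnostic category equally, so a rare category with poor recall lowers the average as much as a common one would.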
Timely identification of issue reports reflecting software vulnerabilities is crucial, particularly for Internet-of-Things (IoT) systems, where analysis is slower than for non-IoT systems. While Machine Learning (ML) and Large Language Models (LLMs) can detect vulnerability-indicating issues in non-IoT systems, their use for IoT remains unexplored. We are the first to tackle this problem by proposing two approaches: (1) combining ML and LLMs with Natural Language Processing (NLP) techniques to detect vulnerability-indicating issues in 21 Eclipse IoT projects and (2) fine-tuning a pre-trained BERT Masked Language Model (MLM) on 11,000 GitHub issues for classifying vulnerability-indicating issues. Our best-performing model is a Support Vector Machine (SVM) trained on BERT NLP features, which achieves an Area Under the receiver operating characteristic Curve (AUC) of 0.65. The fine-tuned BERT achieves 0.26 accuracy, emphasizing the importance of exposing all data during training. Our contributions set the stage for accurately detecting IoT vulnerabilities from issue reports, similar to non-IoT systems.
With the growth of global maritime transportation, energy optimization has become crucial for reducing costs and ensuring operational efficiency. Shaft power is the mechanical power transmitted from the engine to the shaft and directly impacts fuel consumption, making its accurate prediction an important step in optimizing vessel performance. Power consumption is highly correlated with ship parameters such as speed and shaft revolutions per minute, as well as weather and sea conditions. Frequent access to this operational data can improve prediction accuracy. However, obtaining high-quality sensor data is often infeasible and costly, making alternative sources such as noon reports a viable option. In this paper, we propose a transfer learning-based approach for predicting vessel shaft power, where a model is initially trained on high-frequency data from one vessel and then fine-tuned with low-frequency daily noon reports from other vessels. We tested our approach on sister vessels (identical dimensions and configurations), a similar vessel (slightly larger with a different engine), and a different vessel (distinct dimensions and configurations). The experiments showed that the mean abso
Screening mammography is high-volume, time-sensitive, and documentation-heavy. Radiologists must translate subtle visual findings into consistent BI-RADS assessments, breast density categories, and structured narrative reports. While recent Vision Language Models (VLMs) enable image-to-text reporting, many rely on closed cloud systems or tightly coupled architectures that limit privacy, reproducibility, and adaptability. We present MammoWise, a local multi-model pipeline that transforms open-source VLMs into mammogram report generators and multi-task classifiers. MammoWise supports any Ollama-hosted VLM and mammography dataset, and enables zero-shot, few-shot, and Chain-of-Thought prompting, with optional multimodal Retrieval-Augmented Generation (RAG) using a vector database for case-specific context. We evaluate MedGemma, LLaVA-Med, and Qwen2.5-VL on the VinDr-Mammo and DMID datasets, assessing report quality (BERTScore, ROUGE-L), BI-RADS classification, breast density, and key findings. Report generation is consistently strong and improves with few-shot prompting and RAG. Classification is feasible but sensitive to model and dataset choice. Parameter-efficient fine-tuning (QLoRA) of
In 1984 Edward Witten proposed that an extremely dense form of matter composed of up, down, and strange quarks may be stable at zero pressure (Witten, 1984). Massive nuggets of such dense matter, if they exist, may pass through the Earth and be detectable by the seismic signals they generate (de Rujula and Glashow, 1984). With this motivation we investigated over 1 million seismic data reports to the U.S. Geological Survey for the years 1990-1993 that were not associated with epicentral sources. We report two results: (1) with an average of about 0.16 unassociated reports per minute after data cuts, we found a significant excess over statistical expectation for sets with ten or more reports in ten minutes; and (2) in spite of a very small a priori probability from random reports, we found one set of reports with arrival times and other features appropriate to signals from an epilinear source. This event has the properties predicted for the passage of a nugget of strange quark matter (SQM) through the Earth, although there is no direct confirmation from other phenomenologies.
We present a spectroscopic investigation of $^{169}\mathrm{Tm}^+$ that provides two key foundations for its use as a platform for advanced quantum applications. First, we establish the complete spectroscopic road map for optical cycling (including laser cooling) by performing high-resolution spectroscopy on $^{169}\mathrm{Tm}^+$ ions in an ion trap. We characterize the primary $313\,\mathrm{nm}$ and complementary $448/453\,\mathrm{nm}$ cycling transitions, identify the essential near-infrared repumping frequencies, and determine the magnetic-dipole hyperfine $A$ constants for all relevant levels. Second, we report a detailed characterization of a metastable state as a candidate for hosting a robust qubit, performing lifetime measurements and Zeeman-resolved microwave hyperfine spectroscopy with $\mathrm{kHz}$ precision.
In the face of an infectious disease, a key epidemiological measure is the basic reproduction number, which quantifies the average number of secondary infections caused by a single case in a susceptible population. In practice, the effective reproduction number, denoted as $R_t$, is widely used to assess the transmissibility of the disease at a given time $t$. Estimating this metric in real time is vital for understanding and managing disease outbreaks. Traditional statistical inference often relies on two assumptions. One is that samples are drawn from a homogeneous population distribution, which neglects significant variations in individual transmission rates. The other is the ideal case reporting assumption, which disregards time delays between infection and reporting. In this paper, we thoroughly investigate these critical factors and assess their impact on estimating $R_t$. We first introduce negative binomial and Weibull distributions to characterize transmission rates and reporting delays, respectively, based on which observation and state equations are formulated. Then, we employ Bayesian filtering to estimate $R_t$. Finally, validation using synthetic and empirical data demo
This paper describes a machine translation test set of documents from the auditing domain and its use as one of the "test suites" in the WMT19 News Translation Task for translation directions involving Czech, English and German. Our evaluation suggests that current MT systems optimized for the general news domain can perform quite well even in the particular domain of audit reports. The detailed manual evaluation, however, indicates that deep factual knowledge of the domain is necessary. To the naked eye of a non-expert, translations by many systems seem almost perfect, and automatic MT evaluation with one reference is practically useless for considering these details. Furthermore, we show on a sample document from the domain of agreements that even the best systems completely fail to preserve the semantics of the agreement, namely the identity of the parties.
Influenza and respiratory syncytial virus (RSV) are the leading etiological agents of seasonal acute respiratory infections (ARI) around the world. Medical doctors typically base the diagnosis of ARI on patients' symptoms alone and do not always conduct virological tests necessary to identify individual viruses, which limits the ability to study the interaction between multiple pathogens and make public health recommendations. We consider a stochastic kinetic model (SKM) for two interacting ARI pathogens circulating in a large population and an empirically motivated background process for infections with other pathogens causing similar symptoms. An extended marginal sampling approach based on the Linear Noise Approximation to the SKM integrates multiple data sources and additional model components. We infer the parameters defining the pathogens' dynamics and interaction within a Bayesian hierarchical model and explore the posterior trajectories of infections for each illness based on aggregate infection reports from six epidemic seasons collected by the state health department, and a subset of virological tests from a sentinel program at a general hospital in San Luis Potosí, Méxic
This Preliminary Design Report (PDR) describes the IsoDAR electron-antineutrino source in two volumes which are mostly site-independent and describe the cyclotron driver providing a 60 MeV, 10 mA proton beam (this Volume); and the medium energy beam transport line (MEBT) and target (Volume II). The IsoDAR driver and target will produce about 1.15e23 electron-antineutrinos over five years. Paired with a kton-scale liquid scintillator detector, it will enable a broad particle physics program including searches for new symmetries, new interactions and new particles. Here in Volume I, we describe the driver, which includes the ion source, low energy beam transport, and cyclotron. The latter features Radio-Frequency Quadrupole (RFQ) direct axial injection and represents the first accelerator purpose-built to make use of so-called vortex motion.
The portrayal of crowd accidents by the media can influence public understanding and emotional response, shaping societal perceptions and potentially impacting safety measures and preparedness strategies. This paper critically examines the portrayal of crowd accidents in news coverage by analyzing the texts of 372 media reports of crowd accidents spanning 26 diverse news sources from 1900 to 2019. We investigate how media representations of crowd accidents vary across time and geographical origins. Our methodology combines lexical analysis to unveil prevailing terminologies and sentiment analysis to discern the emotional tenor of the reports. The findings reveal the prevalence of the term "stampede" over "panic" in media descriptions of crowd accidents. Notably, divergent patterns are observable when comparing Western versus South Asian media (notably India and Pakistan), unveiling a cross-cultural dimension. Moreover, the analysis detects a gradual transition from "crowd stampede" to "crowd crush" in media and Wikipedia narratives in recent years, suggesting evolving lexical sensitivities. Sentiment analysis uncovers a consistent association with fear-related language, indicative
Extensive air showers induced by high-energy cosmic rays provide a window into understanding the most energetic phenomena in the universe. We present a new method for observing these showers using the silicon imaging detector Subaru Hyper Suprime-Cam (HSC). This method has the advantage of being able to measure individual secondary particles. When paired with a surface detector array, silicon imaging detectors like Subaru HSC will be useful for studying the properties of extensive air showers in detail. This report outlines the first results of observing extensive air showers with Subaru HSC. The potential for reconstructing the incident direction of primary cosmic rays is demonstrated and possible interdisciplinary applications are discussed.
In the past three decades, many stars orbiting the supermassive black hole (SMBH) at the Galactic Centre (Sgr A*) have been identified. Their orbits can place stringent constraints on the mass of the SMBH. In particular, the star S2 has completed at least one period since its position was first detected, which provides rich information for examining the properties of the SMBH and its surrounding astrophysical environment. Here, we report an interesting phenomenon: if a significant amount of dark matter or stellar mass is distributed around the SMBH, the precession speed of the S2 stellar orbit could be slowed down by at most 27\% compared with that without dark matter surrounding the SMBH, assuming the optimal dark matter scenario. We anticipate that future high-quality observational data of the S2 stellar orbit or other stellar orbits can help reveal the actual mass distribution near the SMBH and the nature of dark matter.
Spin angular momentum transfer in magnetic bilayers offers the possibility of ultrafast and low-loss operation for next-generation spintronic devices. We report field- and temperature-dependent measurements of the magnetization precession in Co$_2$FeAl/(Ga,Mn)As by time-resolved magneto-optical Kerr effect (TRMOKE). Analysis of the effective Gilbert damping and phase shift indicates a clear signature of an enhanced dynamic exchange coupling between the two ferromagnetic (FM) layers due to the reinforced spin pumping at resonance. The temperature dependence of the dynamic exchange coupling reveals a primary contribution from the ferromagnetism in (Ga,Mn)As.
News reports in the media contain records of a wide range of socio-economic and political events over time. Using a publicly available, large digital database of news records, and aggregating them over time, we study the network of ethnic conflicts and human rights violations. Complex network analyses of the events and the involved actors provide important insights into the engaging actors, groups, establishments and sometimes nations, pointing at their long-range effect over space and time. We find power-law decays in the distributions of actor mentions, co-actor mentions and degrees, as well as the dominance of influential actors and groups. The most influential actors or groups form a giant connected component which grows in time, and is expected to encompass all actors globally in the long run. We demonstrate how targeted removal of actors may help stop the spread of unruly events. We study the cause-effect relation between types of events, and our quantitative analysis confirms that ethnic conflicts lead to human rights violations, while it does not support the converse.
Beyond generating long and topic-coherent paragraphs as in traditional captioning tasks, the medical image report composition task poses more task-oriented challenges by requiring both highly accurate medical term diagnosis and multiple heterogeneous forms of information, including impression and findings. Current methods often generate the most common sentences for individual cases due to dataset bias, regardless of whether the sentences properly capture key entities and relationships. Such limitations severely hinder their applicability and generalization capability in medical report composition, where the most critical sentences lie in the descriptions of abnormal diseases that are relatively rare. Moreover, some medical terms appearing in one report are often entangled with each other and co-occur, e.g. symptoms associated with a specific disease. To enforce the semantic consistency of medical terms to be incorporated into the final reports and encourage the sentence generation for rare abnormal descriptions, we propose a novel framework that unifies template retrieval and sentence generation to handle both common and rare abnormality while ensuring the semantic-coherency amon
"Winning" bets were made on cloned website and would have lost money, WSJ finds
Researchers found that a Chinese sodium-ion battery performs far better than expected, with production quality and design features comparable to Tesla’s batteries. If engineers can improve cold-weather charging and energy density, sodium could become a cheaper and more abundant alternative to lithium for EVs and large-scale energy storage.