Background: Artificial intelligence (AI) has emerged as a disruptive innovation in medicine, yet its adoption within gastroenterology remains limited and poorly characterized. We aimed to examine knowledge, practical applications, perceived barriers, and expectations regarding AI among gastroenterology specialists in Spain. Methods: We conducted a cross-sectional observational study using a structured online survey distributed by the Spanish Society of Digestive Pathology (SEPD) in 2025. The questionnaire collected sociodemographic data, patterns of AI use, perceptions, and educational needs. Descriptive statistics and multivariable models were applied. Results: Among 283 respondents (mean age 44.6 +/- 9.7 years), 87.5% acknowledged AI as a transformative tool, but only 60.2% (95% CI: 54.3-66.1%) reported using it, mostly outside institutional frameworks. Notably, 80.2% of users initiated AI use within the past year. Independent predictors of frequent use included previous training (OR=2.44), employment in university hospitals (OR=2.14), and younger age (OR=1.36 per 5-year decrease). Main barriers were lack of training (61%), absence of institutional strategies (46%), and ethical c
This study evaluated self-reported response certainty across several large language models (GPT, Claude, Llama, Phi, Mistral, Gemini, Gemma, and Qwen) using 300 gastroenterology board-style questions. The highest-performing models (GPT-o1 preview, GPT-4o, and Claude-3.5-Sonnet) achieved Brier scores of 0.15-0.2 and AUROC of 0.6. Although newer models demonstrated improved performance, all exhibited a consistent tendency towards overconfidence. Uncertainty estimation presents a significant challenge to the safe use of LLMs in healthcare. Keywords: Large Language Models; Confidence Elicitation; Artificial Intelligence; Gastroenterology; Uncertainty Quantification
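The two metrics quoted above can be illustrated with a minimal sketch. It assumes each answer carries a self-reported confidence in [0, 1] and a 0/1 correctness label; the function names and sample values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def brier_score(confidences: Sequence[float], correct: Sequence[int]) -> float:
    """Mean squared gap between stated confidence and 0/1 correctness (lower is better)."""
    if len(confidences) != len(correct) or not confidences:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(confidences, correct)) / len(confidences)


def auroc(confidences: Sequence[float], correct: Sequence[int]) -> float:
    """Chance that a correct answer received higher confidence than an incorrect one.

    Ties count as half a win, so 0.5 means confidence carries no ranking signal.
    """
    pos = [p for p, y in zip(confidences, correct) if y == 1]
    neg = [p for p, y in zip(confidences, correct) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one correct and one incorrect answer")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    confidences = [0.9, 0.8, 0.9, 0.7, 0.95, 0.6]
    correct = [1, 0, 1, 1, 0, 0]
    print("Brier:", brier_score(confidences, correct))
    print("AUROC:", auroc(confidences, correct))
```

Under this definition, a model that always reports the same confidence has an AUROC of 0.5 whatever its accuracy, which is why an AUROC near 0.6 indicates that stated confidence only weakly separates correct from incorrect answers.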
Background and Aims: This study evaluates the medical reasoning performance of large language models (LLMs) and vision language models (VLMs) in gastroenterology. Methods: We used 300 gastroenterology board exam-style multiple-choice questions, 138 of which contain images, to systematically assess the impact of model configurations, parameters, and prompt engineering strategies using GPT-3.5. Next, we assessed the performance of proprietary and open-source LLMs (versions), including GPT (3.5, 4, 4o, 4omini), Claude (3, 3.5), Gemini (1.0), Mistral, Llama (2, 3, 3.1), Mixtral, and Phi (3), across different interfaces (web and API), computing environments (cloud and local), and model precisions (with and without quantization). Finally, we assessed accuracy using a semiautomated pipeline. Results: Among the proprietary models, GPT-4o (73.7%) and Claude3.5-Sonnet (74.0%) achieved the highest accuracy, outperforming the top open-source models: Llama3.1-405b (64%), Llama3.1-70b (58.3%), and Mixtral-8x7b (54.3%). Among the quantized open-source models, the 6-bit quantized Phi3-14b (48.7%) performed best. The scores of the quantized models were comparable to those of the full-precision
In modern collider experiments, the quest to explore fundamental interactions between elementary particles has reached unparalleled levels of precision. Signatures from particle physics detectors are low-level objects (such as energy depositions or tracks) encoding the physics of collisions (the final state particles of hard scattering interactions). Their complete simulation in a detector is a computationally and storage-intensive task. To address this computational bottleneck in particle physics, alternative approaches have been developed that introduce additional assumptions and trade off accuracy for speed. The field has seen a surge of interest in surrogate modeling of the detector simulation, fueled by advances in deep generative models. These models aim to generate responses that are statistically identical to the observed data. In this paper, we conduct a comprehensive and exhaustive taxonomic review of the existing literature on the simulation of detector signatures from both methodological and application-wise perspectives. Initially, we formulate the problem of detector signature simulation and discuss its different variations that can be unified. Next, we classify
This paper investigates sentiment classification of Steam game reviews using an attention-based Bidirectional Long Short-Term Memory (BiLSTM) model. Using a dataset of 50,000 reviews sampled from a larger Steam review corpus, the authors compare a traditional machine learning baseline based on TF-IDF and PyCaret AutoML with a deep learning approach implemented in PyTorch. The proposed BiLSTM+Attention model is trained with class-weighted cross-entropy to address class imbalance and achieves 83% accuracy and 85% weighted F1-score on the test set, with 90% recall for negative reviews. The paper also presents attention visualizations to show interpretability by highlighting sentiment-bearing words. The study concludes that the BiLSTM+Attention model is effective for analyzing user sentiment in Steam reviews and useful for helping developers understand player feedback.
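The class-weighted cross-entropy mentioned above can be sketched in plain Python. The abstract does not say how the weights were chosen, so the inverse-frequency scheme below is an assumption, as are the function names; the loss is normalized by the sum of the target weights, which is the convention PyTorch's `CrossEntropyLoss` uses with `weight` and `reduction='mean'`.

```python
import math
from typing import List, Sequence


def class_weights(labels: Sequence[int], num_classes: int) -> List[float]:
    """Inverse-frequency weights n / (num_classes * count_c); an assumed scheme."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    if 0 in counts:
        raise ValueError("every class needs at least one example")
    n = len(labels)
    return [n / (num_classes * c) for c in counts]


def weighted_cross_entropy(
    logits: Sequence[Sequence[float]],
    labels: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Weighted mean of per-example negative log-likelihoods."""
    total = 0.0
    weight_sum = 0.0
    for row, y in zip(logits, labels):
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += weights[y] * (log_z - row[y])
        weight_sum += weights[y]
    return total / weight_sum


if __name__ == "__main__":
    # Hypothetical toy batch: class 0 (negative) is the minority class.
    labels = [0, 1, 1, 1]
    logits = [[0.2, 1.0], [0.1, 2.0], [0.3, 1.5], [1.0, 0.5]]
    w = class_weights(labels, num_classes=2)
    print("weights:", w)
    print("loss:", weighted_cross_entropy(logits, labels, w))
```

With these weights, a mistake on the minority class contributes more to the loss than the same mistake on the majority class, which is the usual motivation for weighting when recall on the rarer class matters.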
AI coding agents increasingly accept assigned software tasks, modify repositories under bounded authority, and return work packages for review. Prior work proposed the software delegation contract, covering the task, authority, returned work package, and acceptance context, as the unit of analysis for delegated coding work, but did not measure its effects. This paper reports a controlled pilot study of explicit delegation contracts for coding agents. We built a dependency-free TypeScript API task environment with seeded defects and documentation gaps, authored ten tasks across five families, and ran 64 agent executions across two model tiers under three conditions: a realistic issue-style prompt, an explicit delegation contract, and a contract with a required evidence bundle. Each run was scored with hidden acceptance tests, mutation checks, and scope analysis, then reviewed by three independent condition-blinded model-based reviewers using a fixed rubric, for 192 reviews. Explicit contracts did not improve objective task outcomes: all 64 runs passed hidden acceptance checks, with zero scope violations. They did improve reviewability. Evidence sufficiency improved in 22 of 30 paire
Compression is essential to storing and transmitting medical videos, but the effect of compression on downstream medical tasks is often ignored. Furthermore, systems in practice rely on standard video codecs, which naively allocate bits between medically relevant frames or parts of frames. In this work, we present an empirical study of some deficiencies of classical codecs on gastroenterology videos, and motivate our ongoing work to train a learned compression model for colonoscopy videos. We show that two of the most common classical codecs, H264 and HEVC, compress medically relevant frames statistically significantly worse than medically nonrelevant ones, and that polyp detector performance degrades rapidly as compression increases. We explain how a learned compressor could allocate bits to important regions and allow detection performance to degrade more gracefully. Many of our proposed techniques generalize to medical video domains beyond gastroenterology
The possible topological nature of Kondo and mixed valence insulators has been a recent topic of interest in condensed matter physics. Attention has focused on SmB6, which has long been known to exhibit a low-temperature transport anomaly whose origin is of independent interest. We argue that it is possible to resolve the topological nature of surface states by uniquely accessing the surface electronic structure of the low-temperature anomalous transport regime through combining state-of-the-art laser- and synchrotron-based angle-resolved photoemission spectroscopy (ARPES) with or without spin resolution. A combination of low temperature and ultra-high resolution (laser), which is lacking in previous ARPES studies of this compound, is the key to resolving the possible existence of a topological surface state in SmB6. Here we outline an experimental algorithm to systematically explore the topological versus trivial or mixed (topological and trivial surface state admixture as in the first 3D TI Bi$_{1-x}$Sb$_x$) nature of the surface states in Kondo and mixed valence insulators. We conclude based on this methodology that the observed topology of the surface Fermi surface in our low temperature
Ultralight dark matter refers to the lightest potential dark matter candidates. We will focus on the mass range that has been studied using astrophysical and cosmological observations, corresponding to a mass $10^{-24} \, \mathrm{eV} \lesssim m \lesssim 10^{-18} \, \mathrm{eV}$. We will discuss the motivations for this mass range. The most studied model in this range corresponds to a minimally coupled, single, classical, spin-0 field comprising all dark matter. However, the work exploring extensions of this model (for example, higher spin, self-coupled, multiple field, and mixed models) will be one of the focuses of this review. The phenomenology associated with ultralight dark matter is rich and includes linear effects on the primordial power spectrum, core structures forming at the center of halos, nonlinear effects resulting in heating of stellar distributions, and non-relativistic effects relating to pulsar signals and black hole superradiance, to name a few. This set of effects has been studied using an equally extensive set of numerical tools. We will summarize the most common ones and discuss their applications and limitations. Ultralight dark matter also has a wide variety
The Great Divide in metaphysical debates about laws of nature is between Humeans, who think that laws merely describe the distribution of matter, and non-Humeans, who think that laws govern it. The metaphysics can place demands on the proper formulations of physical theories. It is sometimes assumed that the governing view requires a fundamental / intrinsic direction of time: to govern, laws must be dynamical, producing later states of the world from earlier ones, in accord with the fundamental direction of time in the universe. In this paper, we propose a minimal primitivism about laws of nature (MinP) according to which there is no such requirement. On our view, laws govern by constraining the physical possibilities. Our view captures the essence of the governing view without taking on extraneous commitments about the direction of time or dynamic production. Moreover, as a version of primitivism, our view requires no reduction / analysis of laws in terms of universals, powers, or dispositions. Our view accommodates several potential candidates for fundamental laws, including the principle of least action, the Past Hypothesis, the Einstein equation of general relativity, and even
We briefly review the various contexts within which one might address the issue of ``why'' the dimensionless constants of Nature have the particular values that they are observed to have. Both the general historical trend, in physics, of replacing a-priori-given, absolute structures by dynamical entities, and anthropic considerations, suggest that coupling ``constants'' have a dynamical nature. This hints at the existence of observable violations of the Equivalence Principle at some level, and motivates the need for improved tests of the Equivalence Principle.
Planets form and obtain their compositions from the leftover material present in protoplanetary disks of dust and gas surrounding young stars. The chemical make-up of a disk influences every aspect of planetary composition including their overall chemical properties, volatile content, atmospheric composition, and potential for habitability. This Review discusses our knowledge of the chemical and isotopic composition of Solar System materials and how this information can be used to place constraints on the formation pathways of terrestrial planets. We conclude that planetesimal formation by the streaming instability followed by rapid accretion of drifting pebbles within the protoplanetary disk lifetime reproduces most of the chemical and isotopic observables in the Solar System. This finding has important implications for planetary habitability beyond the Solar System because in pebble accretion, volatiles important for life are accreted during the main growth phase of rocky planets as opposed to the late stage. Finally, we explore how bulk chemical inventories and masses of planetary bodies control the composition of their primordial atmospheres and their potential to develop habitable
Natural products, as metabolites from microorganisms, animals, or plants, exhibit diverse biological activities, making them crucial for drug discovery. Existing deep learning methods for natural products research primarily rely on supervised learning approaches designed for specific downstream tasks. However, such a one-model-for-a-task paradigm often lacks generalizability and leaves significant room for performance improvement. Additionally, existing molecular characterization methods are not well-suited for the unique tasks associated with natural products. To address these limitations, we have pre-trained a foundation model for natural products based on their unique properties. Our approach employs a novel pretraining strategy that is especially tailored to natural products. By incorporating contrastive learning and masked graph learning objectives, we emphasize evolutionary information from molecular scaffolds while capturing side-chain information. Our framework achieves state-of-the-art (SOTA) results in various downstream tasks related to natural product mining and drug discovery. We first compare taxonomy classification with synthesized molecule-focused baselines t
In a recent publication, we demonstrated electrical spin injection and detection in n-type silicon at temperatures up to 500K using ferromagnetic metal / SiO2 tunnel barrier contacts in a three-terminal geometry (Nature Commun. 2:245 doi:10.1038/ncomms125 (2011)). In comparing our measured spin-voltage signal with the value predicted by theory, we followed the analysis of Tran et al. (Phys. Rev. Lett. 102, 036601 (2009)), and inadvertently propagated an error found therein. As they note in a recent erratum (arXiv:0810.4770v2), the correct expression for the spin resistance area product from the theory for a sample with a spin diffusion length LSD much less than the contact width or channel thickness (our experimental situation) is given by the product {gamma}^2 {rho} LSD, where {gamma} is the tunneling spin polarization, and {rho} is the resistivity of the semiconductor transport channel. With this correction, our measured spin voltages are much larger than those predicted by theory, rather than in good agreement as we stated. We emphasize that the basic conclusions of our paper are the same - the systematic decrease in electron spin lifetime with increasing electron density demons
Massless Dirac electrons in condensed matter have attracted considerable attention. Unlike conventional electrons, Dirac electrons are described in the form of two-component wave functions. In the surface state of topological insulators, these two components are associated with the spin degrees of freedom, hence governing the magnetic properties. Therefore, the observation of the two-component wave function provides a useful clue for exploring the novel spin phenomena. Here we show that the two-component nature is manifested in the Landau levels (LLs) whose degeneracy is lifted by a Coulomb potential. Using spectroscopic-imaging scanning tunneling microscopy, we visualize energy and spatial structures of LLs in a topological insulator Bi2Se3. The observed potential-induced LL splitting and internal structures of Landau orbits are distinct from those in a conventional electron system and are well reproduced by a two-component model Dirac Hamiltonian. Our model further predicts non-trivial energy-dependent spin-magnetization textures in a potential variation. This provides a way to manipulate spins in the topological surface state.
Increasing demands on medical imaging departments are taking a toll on the radiologist's ability to deliver timely and accurate reports. Recent technological advances in artificial intelligence have demonstrated great potential for automatic radiology report generation (ARRG), sparking an explosion of research. This survey paper conducts a methodological review of contemporary ARRG approaches by way of (i) assessing datasets based on characteristics, such as availability, size, and adoption rate, (ii) examining deep learning training methods, such as contrastive learning and reinforcement learning, (iii) exploring state-of-the-art model architectures, including variations of CNN and transformer models, (iv) outlining techniques integrating clinical knowledge through multimodal inputs and knowledge graphs, and (v) scrutinising current model evaluation techniques, including commonly applied NLP metrics and qualitative clinical reviews. Furthermore, the quantitative results of the reviewed models are analysed, where the top performing models are examined to seek further insights. Finally, potential new directions are highlighted, with the adoption of additional datasets from other rad
This paper describes a rapid feasibility study of using GPT-4, a large language model (LLM), to (semi)automate data extraction in systematic reviews. Despite the recent surge of interest in LLMs, there is still a lack of understanding of how to design LLM-based automation tools and how to robustly evaluate their performance. During the 2023 Evidence Synthesis Hackathon we conducted two feasibility studies. First, we automatically extracted study characteristics from studies in the human clinical, animal, and social science domains, using two studies from each category for prompt development and ten for evaluation. Second, we used the LLM to predict Participants, Interventions, Controls and Outcomes (PICOs) labelled within 100 abstracts in the EBM-NLP dataset. Overall, results indicated an accuracy of around 80%, with some variability between domains (82% for human clinical, 80% for animal, and 72% for studies of human social sciences). Causal inference methods and study design were the data extraction items with the most errors. In the PICO study, participants and intervention/control showed high accuracy (>80%), whereas outcomes were more challenging. Evaluation was done manually; scoring
Upon mechanical loading, granular materials yield and undergo plastic deformation. The nature of plastic deformation is essential for the development of macroscopic constitutive models and the understanding of shear band formation. However, we still do not fully understand the microscopic nature of plastic deformation in disordered granular materials. Here we used synchrotron X-ray tomography to track the structural evolution of three-dimensional granular materials under shear. We establish that highly distorted coplanar tetrahedra are the structural defects responsible for microscopic plasticity in disordered granular packings. The elementary plastic events occur through flip events, which correspond to a neighbor-switching process among these coplanar tetrahedra (or, equivalently, to the rotational motion of 4-ring disclinations). These events are discrete in space and possess specific orientations relative to the principal stress direction.
This paper reviews the work done on black hole interior volume, entropy, and evaporation. An insight into the basics for understanding the interior volume is presented. A general analogy to investigate the interior volume of a black hole (BH), the associated quantum mode's entropy, and the evolution relation between the interior and exterior entropy is explained. Using this analogy, we predicted the future of information stored in a BH, its radiation, and evaporation. The results are noted in tables (\ref{tab:1}) and (\ref{tab:2}). To apply this analogy in BH space-time, we investigated the interior volume, entropy, and evolution relation for different types of BHs. Finally, we also investigated the nature of BH radiation and the probability of particle emission during the evaporation process.
Understanding how humans conceptualize and categorize natural objects offers critical insights into perception and cognition. With the advent of Large Language Models (LLMs), a key question arises: can these models develop human-like object representations from linguistic and multimodal data? In this study, we combined behavioral and neuroimaging analyses to explore the relationship between object concept representations in LLMs and human cognition. We collected 4.7 million triplet judgments from LLMs and Multimodal LLMs (MLLMs) to derive low-dimensional embeddings that capture the similarity structure of 1,854 natural objects. The resulting 66-dimensional embeddings were stable, predictive, and exhibited semantic clustering similar to human mental representations. Remarkably, the dimensions underlying these embeddings were interpretable, suggesting that LLMs and MLLMs develop human-like conceptual representations of objects. Further analysis showed strong alignment between model embeddings and neural activity patterns in brain regions such as EBA, PPA, RSC, and FFA. This provides compelling evidence that the object representations in LLMs, while not identical to human ones, share