With upcoming sample return missions across the solar system and the increasing availability of mass spectrometry data, there is an urgent need for methods that analyze such data within the context of existing astrobiology literature and generate plausible hypotheses regarding the emergence of life on Earth. Hypothesis generation from mass spectrometry data is challenging due to factors such as environmental contaminants, the complexity of spectral peaks, and difficulties in cross-matching these peaks with prior studies. To address these challenges, we introduce AstroAgents, a large language model-based, multi-agent AI system for hypothesis generation from mass spectrometry data. AstroAgents is structured around eight collaborative agents: a data analyst, a planner, three domain scientists, an accumulator, a literature reviewer, and a critic. The system processes mass spectrometry data alongside user-provided research papers. The data analyst interprets the data, and the planner delegates specific segments to the scientist agents for in-depth exploration. The accumulator then collects and deduplicates the generated hypotheses, and the literature reviewer identifies relevant literat
Mass spectrometry is a widely used method to study molecules and processes in medicine, life sciences, chemistry, catalysis, and industrial product quality control, among many other applications. One of the main features of some mass spectrometry techniques is the extensive level of characterization (especially when coupled with chromatography and ion mobility methods, or used as part of a tandem mass spectrometry experiment) and the large amount of data generated per measurement. Terabyte scales can easily be reached in mass spectrometry studies. Consequently, mass spectrometry faces the challenge of a high level of data disappearance: researchers often neglect, and then altogether lose access to, the rich information that mass spectrometry experiments could provide. With the development of machine learning methods, the opportunity arises to unlock the potential of these data, enabling previously inaccessible discoveries. The present perspective highlights the reevaluation of mass spectrometry data analysis in light of the new generation of methods and describes significant challenges in the field, particularly related to problems involving the use of electrospray ionization. We argue that further applic
In this work we present and evaluate a radiochemical procedure optimised for the analysis of $^{236}$U and $^{239,240}$Pu in seawater samples by Accelerator Mass Spectrometry (AMS). The method is based on Fe(OH)$_3$ co-precipitation of actinides and uses TEVA and UTEVA extraction chromatography resins in a simplified way for the final U and Pu purification. In order to improve the performance of the method, the radiochemical yields are analysed in 1 to 10 L seawater volumes using alpha spectrometry (AS) and Inductively Coupled Plasma Mass Spectrometry (ICP-MS). Robust 80% plutonium recoveries are obtained; however, it is found that Fe(III) concentration in the precipitation solution and sample volume are the two critical and correlated parameters influencing the initial uranium extraction through Fe(OH)$_3$ co-precipitation. Therefore, we propose an expression that optimises the sample volume and Fe(III) amounts according to both the $^{236}$U and $^{239,240}$Pu concentrations in the samples and the performance parameters of the AMS facility. The method is validated for the current setup of the 1 MV AMS system (CNA, Sevilla, Spain), where He gas is used as a stripper, by analysing
We report a proof-of-principle study demonstrating the first capture and time-of-flight spectrometry of highly charged ions (HCIs) produced following antiproton annihilations in a Penning-Malmberg trap. A multi-step nested-trap technique was developed using the AEgIS experiment to identify annihilation-linked captured ions. The trapping and spectrometry of helium and argon ions demonstrates the approach. This work establishes a foundation for the in-trap synthesis of radioactive HCIs and the study of cold nuclear annihilation fragments, with the long-term goal of enabling a sensitive tool for probing the outer nuclear periphery.
Mass spectrometry is the dominant technology in the field of proteomics, enabling high-throughput analysis of the protein content of complex biological samples. Due to the complexity of the instrumentation and resulting data, sophisticated computational methods are required for the processing and interpretation of acquired mass spectra. Machine learning has shown great promise to improve the analysis of mass spectrometry data, with numerous purpose-built methods for improving specific steps in the data acquisition and analysis pipeline reaching widespread adoption. Here, we propose unifying various spectrum prediction tasks under a single foundation model for mass spectra. To this end, we pre-train a spectrum encoder using de novo sequencing as a pre-training task. We then show that using these pre-trained spectrum representations improves our performance on the four downstream tasks of spectrum quality prediction, chimericity prediction, phosphorylation prediction, and glycosylation status prediction. Finally, we perform multi-task fine-tuning and find that this approach improves the performance on each task individually. Overall, our work demonstrates that a foundation model for
Molecular assembly offers a promising path to detect life beyond Earth, while minimizing assumptions based on terrestrial life. As mass spectrometers will be central to upcoming Solar System missions, predicting molecular assembly from their data without needing to elucidate unknown structures will be essential for unbiased life detection. An ideal agnostic biosignature must be interpretable and experimentally measurable. Here, we show that molecular assembly, a recently developed approach to measure objects that have been produced by evolution, satisfies both criteria. First, it is interpretable for life detection, as it reflects the assembly of molecules with their bonds as building blocks, in contrast to approaches that discount construction history. Second, it can be determined without structural elucidation, as it can be physically measured by mass spectrometry, a property that distinguishes it from other approaches that use structure-based information measures for molecular complexity. Whilst molecular assembly is directly measurable using mass spectrometry data, there are limits imposed by mission constraints. To address this, we developed a machine learning model that predi
A radiochemical method for the isolation of plutonium isotopes from environmental samples, based on the use of chromatography resins specific for actinides (TEVA, Eichrom Industries), has been set up in our laboratory and optimised for their subsequent determination by alpha spectrometry (AS) or accelerator mass spectrometry (AMS). The proposed radiochemical method has replaced in our lab a well-established one based on a relatively unspecific anion-exchange resin (AG 1X8, Biorad), because it is clearly less time consuming, reduces the amounts and molarities of the acid wastes produced, and reproducibly gives high radiochemical yields. To check the reliability of the proposed radiochemical method for the determination of plutonium isotopes in different environmental matrices, twin aliquots of a set of samples were prepared with TEVA and with AG 1X8 resins and measured by AS. Some samples prepared with TEVA resins were also measured by AMS. As shown in the text, there is good agreement between AS and AMS, which adequately validates the method.
Velocity-map imaging of electrons is a pivotal technique in chemical physics. A recent study reported a quantum offset as large as 0.2 cm$^{-1}$ in velocity imaging-based electron spectrometry [Phys. Rev. Lett. 134, 043001 (2025)]. In this work, we assess the existence of this offset through a combination of simulations and experiments. Our simulations reveal that the velocity imaging results reconstructed using the maximum entropy velocity Legendre reconstruction (MEVELER) method exhibit no such offset. Furthermore, experimental measurements of the electron affinity of oxygen conducted at various imaging voltages show no discernible offset attributable to the electric field in the photodetachment region. Therefore, we conclude that there is no evidence for the claimed quantum offset in properly analyzed velocity imaging-based electron spectrometry.
Decay energy spectrometry (DES) is a novel radiometric technique for high-precision analysis of nuclear materials. DES employs the unique thermal detection physics of cryogenic microcalorimeters, with ultra-high energy resolution and 100$\%$ detection efficiency, to accomplish high-precision decay energy measurements. Low-activity nuclear samples of 1 Bq or less, without chemical separation, are used to provide elemental and isotopic compositions in a single measurement. Isotopic ratio precisions of 1 ppm - 1,000 ppm (isotope dependent), which are close to those of mass spectrometry, have been demonstrated in 12-hour DES measurements of ~5 Bq samples of certified reference materials of uranium (U) and plutonium (Pu). DES has very different systematic biases and uncertainties, as well as different sensitivities to nuclides, compared to mass-spectrometry techniques. Therefore, the accuracy and confidence of nuclear material assays can be improved by combining this new technique with existing mass-spectrometry techniques. Commercial-level DES techniques and equipment are being developed for the implementation of DES at the Nuclear Material Laboratory (NML) of International Atomic
A dedicated isochronous storage ring, named the Rare-RI Ring, was constructed at the RI Beam Factory of RIKEN, aiming at precision mass measurements of nuclei located in uncharted territories of the nuclear chart. The Rare-RI Ring employs the isochronous mass spectrometry technique with the goal of achieving a relative mass precision of $10^{-6}$ within a measurement time of less than 1 ms. The performance of the facility was demonstrated through mass measurements of neutron-rich nuclei with well-known masses. The velocity or magnetic rigidity of every particle is measured prior to its injection into the ring, wherein its revolution time is accurately determined. The latter quantity is used to determine the mass of the particle, while the former is needed for non-isochronicity corrections. Mass precisions on the order of $10^{-5}$ were achieved in the first commissioning, which demonstrates that the Rare-RI Ring is a powerful tool for mass spectrometry of short-lived nuclei.
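As a hedged illustration of how revolution time and velocity can combine into a mass value, the following textbook-style derivation assumes a reference particle (subscript 0) for which the ring is isochronous and a particle of interest (subscript 1) that has the same magnetic rigidity $B\rho$ and therefore travels the same closed orbit. The symbols $T$ (revolution time), $\beta$ (velocity in units of $c$), $\gamma$ (Lorentz factor), $m$ and $q$ are introduced here for illustration; this is a sketch under those assumptions, not necessarily the exact analysis used at the facility.

```latex
\begin{align}
  B\rho &= \frac{m_0}{q_0}\,\gamma_0 \beta_0 c
         = \frac{m_1}{q_1}\,\gamma_1 \beta_1 c
         && \text{(equal magnetic rigidity)} \\
  \beta_0 T_0 &= \beta_1 T_1
         && \text{(equal orbit length)} \\
  \frac{m_1}{q_1}
        &= \frac{m_0}{q_0}\,\frac{T_1}{T_0}\,\frac{\gamma_0}{\gamma_1}
         = \frac{m_0}{q_0}\,\frac{T_1}{T_0}\,
           \sqrt{\frac{1-\beta_1^{2}}{1-\left(\frac{T_1}{T_0}\,\beta_1\right)^{2}}}
\end{align}
```

In this form the ratio $T_1/T_0$ carries the leading mass information, while $\beta_1$ enters only through the square-root factor, which is consistent with the velocity serving as a correction term.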
This paper presents an application of artificial intelligence to mass spectrometry data for detecting the habitability potential of ancient Mars. Although the data were collected for Mars, the same approach can be replicated for any terrestrial object in our solar system. Furthermore, the proposed methodology can be adapted to any domain that uses mass spectrometry. This research focuses on the data analysis of two mass spectrometry techniques, evolved gas analysis (EGA-MS) and gas chromatography (GC-MS), which are used to identify specific chemical compounds in geological material samples. The study demonstrates the applicability of EGA-MS and GC-MS data to extra-terrestrial material analysis. The most important features of the proposed methodology include a square root transformation of mass spectrometry values, conversion of raw data to 2D spectrograms, and the use of specific machine learning models and techniques to avoid overfitting on relatively small datasets. Both the EGA-MS and GC-MS datasets come from NASA and from two machine learning competitions in which the author participated. Complete running code for the GC-MS dataset/competition is available on GitHub. Raw training mass spect
Tandem mass spectrometry (MS/MS) stands as the predominant high-throughput technique for comprehensively analyzing protein content within biological samples. This methodology is a cornerstone driving the advancement of proteomics. In recent years, substantial strides have been made in Data-Independent Acquisition (DIA) strategies, facilitating impartial and non-targeted fragmentation of precursor ions. The DIA-generated MS/MS spectra present a formidable obstacle due to their inherently high multiplexing. Each spectrum encapsulates fragmented product ions originating from multiple precursor peptides. This intricacy poses a particularly acute challenge in de novo peptide/protein sequencing, where current methods are ill-equipped to address the multiplexing conundrum. In this paper, we introduce DiaTrans, a deep-learning model based on transformer architecture. It deciphers peptide sequences from DIA mass spectrometry data. Our results show significant improvements over existing state-of-the-art (SOTA) methods, including DeepNovo-DIA and PepNet. Casanovo-DIA enhances precision by 15.14% to 34.8%, recall by 11.62% to 31.94% at the amino acid level, and boosts precision by 59% to 81.36% at the pepti
Fingerprint analysis is a ubiquitous tool for pattern recognition with applications spanning from geolocation and DNA analysis to facial recognition and forensic identification. Central to its utility is the ability to provide accurate identification without an a priori mathematical model for the pattern. We report a data-driven fingerprint approach for nanoelectromechanical systems mass spectrometry (NEMS-MS) that enables mass measurements of particles and molecules using complex, uncharacterized nanoelectromechanical devices of arbitrary specification. NEMS-MS is based on the frequency shifts of the NEMS vibrational modes induced by analyte adsorption. The sequence of frequency shifts constitutes a fingerprint of this adsorption, which is directly amenable to pattern matching. Two current requirements of NEMS-based mass spectrometry are: (1) a priori knowledge or measurement of the device mode-shapes, and (2) a mode-shape-based model that connects the induced modal frequency shifts to mass adsorption. This may not be possible for advanced NEMS with three-dimensional mode-shapes and nanometer-sized features. The advance reported here eliminates this impediment, thereby allowing de
Imaging mass spectrometry (IMS) is a powerful tool for untargeted, highly multiplexed molecular mapping of tissue in biomedical research. IMS offers a means of mapping the spatial distributions of molecular species in biological tissue with unparalleled chemical specificity and sensitivity. However, most IMS platforms are not able to achieve microscopy-level spatial resolution and lack cellular morphological contrast, necessitating subsequent histochemical staining, microscopic imaging and advanced image registration steps to enable molecular distributions to be linked to specific tissue features and cell types. Here, we present a virtual histological staining approach that enhances spatial resolution and digitally introduces cellular morphological contrast into mass spectrometry images of label-free human tissue using a diffusion model. Blind testing on human kidney tissue demonstrated that the virtually stained images of label-free samples closely match their histochemically stained counterparts (with Periodic Acid-Schiff staining), showing high concordance in identifying key renal pathology structures despite utilizing IMS data with 10-fold larger pixel size. Additionally, our a
Aerosols found in the atmosphere affect the climate and worsen air quality. To mitigate these adverse impacts, aerosol formation and aerosol chemistry in the atmosphere need to be better mapped out and understood. Currently, mass spectrometry is the single most important analytical technique in atmospheric chemistry and is used to track and identify compounds and processes. Vast amounts of data are collected in each measurement by current time-of-flight and orbitrap mass spectrometers using modern rapid data acquisition practices. However, compound identification remains a major bottleneck in data analysis due to a lack of reference libraries and analysis tools. Data-driven compound identification approaches could alleviate the problem, yet remain rare to non-existent in atmospheric science. In this perspective, we review the current state of data-driven compound identification with mass spectrometry in atmospheric science, and discuss current challenges and possible future steps towards a digital mass spectrometry era in atmospheric science.
While studying nucleic acids to reveal the weak interactions responsible for their three-dimensional structure and for their interactions with drugs, we also contributed to the field of biomolecular mass spectrometry, both in terms of fundamental understanding and with new methodological developments. A first goal was to develop mass spectrometry approaches to detect non-covalent interactions between antitumor drugs and their DNA target. Twenty years ago, our attention turned towards specific DNA structures such as the G-quadruplex (a structure formed by guanine-rich strands). Mass spectrometry allows one to discern which molecules interact with one another by measuring the masses of the complexes, and to quantify the affinities by measuring their abundance. The most important findings came from unexpected masses. For example, we showed the formation of higher- or lower-order structures by G-quadruplexes used in traditional biophysical assays. We can also derive a complete thermodynamic and kinetic description of G-quadruplex folding pathways by measuring cation binding, one at a time. Getting quantitative information required accounting for nonspecific adduct formation and for the response
The Mars Spectrometry 2: Gas Chromatography challenge was sponsored by NASA and run on the DrivenData competition platform in 2022. This report describes the solution which achieved the second-best score on the competition's test dataset. The solution utilized two-dimensional, image-like representations of the competition's chromatography data samples. A number of different Convolutional Neural Network models were trained and ensembled for the final submission.
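To make the idea of a two-dimensional, image-like representation concrete, here is a minimal sketch that bins chromatography readings into an m/z by time grid. It assumes each sample is a set of (time, mass, intensity) readings with non-negative intensities; the function name `chromatogram_to_image`, the bin counts, and the max-per-cell rule are illustrative choices, not details taken from the report.

```python
import numpy as np


def chromatogram_to_image(time, mass, intensity, n_time_bins=4, max_mass=5):
    """Bin (time, m/z, intensity) readings into a 2D array.

    Rows are integer m/z channels, columns are equal-width time bins.
    All parameter names and defaults are hypothetical.
    """
    time = np.asarray(time, dtype=float)
    mass = np.asarray(mass, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    # Round m/z to integer channels and drop readings outside [0, max_mass).
    mz = np.rint(mass).astype(int)
    keep = (mz >= 0) & (mz < max_mass)
    time, mz, intensity = time[keep], mz[keep], intensity[keep]

    image = np.zeros((max_mass, n_time_bins))
    if time.size == 0:
        return image

    # Map each time stamp to a bin index over the observed time range.
    span = time.max() - time.min()
    if span == 0:
        t_idx = np.zeros(time.size, dtype=int)
    else:
        t_idx = ((time - time.min()) / span * n_time_bins).astype(int)
        t_idx = np.minimum(t_idx, n_time_bins - 1)

    # Keep the strongest reading that falls into each (m/z, time) cell.
    np.maximum.at(image, (mz, t_idx), intensity)

    # Scale to [0, 1] so that samples are comparable as image inputs.
    peak = image.max()
    return image / peak if peak > 0 else image
```

An array of this shape can be fed to a convolutional network like any single-channel image; real inputs would use far more time bins and m/z channels than the small defaults shown here.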
Weighing particles above the megadalton mass range has been a persistent challenge in commercial mass spectrometry. Recently, nanoelectromechanical systems-based mass spectrometry (NEMS-MS) has shown remarkable performance in this mass range, especially with the advance of performing mass spectrometry under entirely atmospheric conditions. This advance reduces the overall complexity and cost, while improving the limit of detection. However, this technique has required tracking two mechanical modes and accurate knowledge of the mode shapes, which may deviate from their ideal values, especially due to air damping. Here, we used a NEMS architecture with a central platform, which enables the calculation of mass from single-mode measurements. Experiments were conducted using polystyrene and gold nanoparticles to demonstrate the successful acquisition of mass spectra using a single mode, with improved areal capture efficiency. This advance represents a step forward in NEMS-MS, bringing it closer to practical use for mass sensing of nanoparticles.
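A single-mode mass estimate can be sketched with the standard first-order perturbation relation for a small point mass landing on a resonator, $\Delta f / f_0 \approx -\Delta m\,\phi^2 / (2 M_\mathrm{eff})$. The sketch below assumes the normalized mode shape $\phi$ is approximately constant (taken as 1) over the landing platform, so that one frequency shift suffices; the function name, parameters, and numerical values are hypothetical and are not taken from the paper.

```python
def added_mass_kg(f0_hz, delta_f_hz, effective_mass_kg, mode_shape_sq=1.0):
    """Estimate the adsorbed mass from a single-mode frequency shift.

    Uses the first-order relation delta_f / f0 = -delta_m * phi^2 / (2 * M_eff),
    valid when the added mass is much smaller than the effective mass.
    mode_shape_sq is phi^2 at the landing position (assumed uniform here).
    """
    if f0_hz <= 0 or effective_mass_kg <= 0 or mode_shape_sq <= 0:
        raise ValueError("f0, effective mass and mode_shape_sq must be positive")
    return -2.0 * effective_mass_kg * delta_f_hz / (f0_hz * mode_shape_sq)


# Hypothetical example: a 1 MHz mode with 1 pg effective mass
# shifts down by 5 Hz when a particle lands on the platform.
particle_mass = added_mass_kg(f0_hz=1.0e6, delta_f_hz=-5.0, effective_mass_kg=1.0e-15)
```

When the landing position is unknown and the mode shape varies along the device, `mode_shape_sq` is not known in advance, which is why conventional approaches track a second mode.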
Assessing the presence of chemical, biological, radiological and nuclear threats is a crucial task, usually addressed by analyzing the presence of spectral features in a measured absorption profile. Through ghost spectrometry, the use of quantum light allows these measurements to be performed remotely without compromising the measurement accuracy. However, achieving a sufficient signal-to-noise ratio typically requires long acquisition times, which detracts from the benefits provided by remote sensing. In many instances, though, reconstructing the full spectral lineshape of an object is not needed, and the interest lies in discriminating whether a spectrally absorbing object is present or not. Here we show that this task can be performed quickly and accurately through ghost spectrometry by comparing the low-resource measurement with a reference. We discuss the experimental results obtained with different samples and complement them with simulations to explore the most common scenarios.
Trends in the five main branches of atomic spectrometry (absorption, emission, mass, fluorescence and ionization spectrometry) are considered. The first three techniques are the most widespread and universal, with the best sensitivity attributed to atomic mass spectrometry. In the direct elemental analysis of solid samples, the leading roles are now held by laser-induced breakdown and laser ablation mass spectrometry, and the related techniques with transfer of the laser ablation products into inductively-coupled plasma. Advances in the design of diode lasers and optical parametric oscillators promote developments in fluorescence and ionization spectrometry, and also in absorption techniques, where the use of optical cavities for increased effective absorption pathlength is expected to expand. Prospects for analytical instrumentation are seen in higher productivity, portability, miniaturization, incorporation of advanced software, automated sample preparation and transition to a multifunctional modular architecture. Steady progress and growth in applications of plasma- and laser-based methods are observed. Interest in absolute (standardless) analysis has revived, particu