Automated agents, powered by large language models (LLMs), are emerging as the go-to tool for querying information. However, evaluation benchmarks for LLM agents rarely feature natural questions that are both information-seeking and genuinely time-consuming for humans. To address this gap we introduce MoNaCo, a benchmark of 1,315 natural and time-consuming questions that require dozens, and at times hundreds, of intermediate steps to solve -- far more than any existing QA benchmark. To build MoNaCo, we developed a decomposed annotation pipeline to elicit and manually answer real-world time-consuming questions at scale. Frontier LLMs evaluated on MoNaCo achieve at most 61.2% F1, hampered by low recall and hallucinations. Our results underscore the limitations of LLM-powered agents in handling the complexity and sheer breadth of real-world information-seeking tasks -- with MoNaCo providing an effective resource for tracking such progress. The MoNaCo benchmark, codebase, prompts and model predictions are all publicly available at: https://tomerwolgithub.github.io/monaco
We construct an X-ray spectral model of reprocessing by a torus in an active galactic nucleus (AGN) with a Monte Carlo simulation framework MONACO. Two torus geometries of smooth and clumpy cases are considered and compared. In order to reproduce a Compton shoulder accurately, MONACO includes not only free electron scattering but also bound electron scattering. Raman and Rayleigh scattering are also treated, and scattering cross sections dependent on chemical states of hydrogen and helium are included. Doppler broadening by turbulence velocity can be implemented. Our model gives consistent results with other available models, such as MYTorus, except for differences due to different physical parameters and assumptions. We studied the dependence on torus parameters for Compton shoulder, and found that the intensity ratio of Compton shoulder to line core mainly depends on the column density, inclination angle, and metal abundance. For instance, an increase of metal abundance makes the Compton shoulder relatively weak. Also, the shape of the Compton shoulder depends on the column density. Furthermore, these dependences differ between the smooth and clumpy cases. Then, we discuss the possib
We consider a non-interacting electron gas confined to a two-dimensional crystal by the action of a perpendicular magnetic field; in the one-particle approximation, the dynamics of the system is modelled by a spectrally gapped Bloch-Landau Hamiltonian. No commensurability condition is assumed between the magnetic flux per unit cell and the quantum of magnetic flux. We construct a non-equilibrium almost-stationary state (NEASS) which "dresses" the equilibrium Fermi projection on states below the spectral gap, and models the state of the system after the addition of a weak external electric field of strength $\varepsilon \ll 1$. Having in mind applications to the integer quantum Hall effect, we probe the response of a current operator in the direction transverse to that of the applied electric field, and show that the resulting current density in the NEASS is linear in $\varepsilon$, with no power-law corrections. The linear response coefficient, namely the Hall conductivity, is computed in terms of the equilibrium Fermi projection via the double-commutator formula, in accordance with the prediction from Kubo's linear response theory. Our results generalize the methods and findings o
We examine the statistical properties of a closed monetary economy with multi-aggregate interactions. Building upon Yakovenko's single-agent monetary model (Dragulescu and Yakovenko, 2000), we investigate the joint equilibrium distribution of aggregate size and wealth. By comparing theoretical and simulated data, we validate our findings and investigate the influence of both micro dynamics and macro characteristics of the system on the distribution. Additionally, we analyze the system's convergence towards equilibrium under various conditions. Our laboratory model may offer valuable insights into macroeconomic phenomena, allowing one to reproduce typical wealth distribution features observed in real economies.
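As a point of reference for the abstract above, the single-agent exchange model it builds on can be sketched in a few lines. This is a minimal illustration of the basic closed-economy random exchange only, not the multi-aggregate model of the paper; the function name `simulate_exchange`, the fixed transfer `delta`, and all parameter values are assumptions chosen for illustration.

```python
import random


def simulate_exchange(n_agents=500, initial_money=10, delta=1,
                      n_steps=200_000, seed=0):
    """Random pairwise money exchange in a closed economy (illustrative sketch).

    Each step picks two distinct agents at random; the first pays `delta`
    to the second if it can afford to. Total money is conserved and
    no agent is allowed to go into debt.
    """
    rng = random.Random(seed)
    money = [initial_money] * n_agents
    for _ in range(n_steps):
        payer, receiver = rng.sample(range(n_agents), 2)
        if money[payer] >= delta:  # no debt allowed
            money[payer] -= delta
            money[receiver] += delta
    return money


if __name__ == "__main__":
    money = simulate_exchange()
    mean = sum(money) / len(money)
    below_mean = sum(m < mean for m in money) / len(money)
    print(f"total money: {sum(money)}, mean: {mean:.2f}")
    print(f"fraction of agents below the mean: {below_mean:.2f}")
```

In this kind of model the stationary distribution is expected to approach an exponential (Boltzmann-Gibbs) form, so a histogram of `money` after many steps should be strongly skewed, with most agents below the mean.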
Deep learning has revolutionized various fields by enabling highly accurate predictions and estimates. One important application is probabilistic prediction, where models estimate the probability of events rather than deterministic outcomes. This approach is particularly relevant, yet still unexplored, for segmentation tasks where each pixel in an image needs to be classified. Conventional models often overlook the probabilistic nature of labels, but accurate uncertainty estimation is crucial for improving the reliability and applicability of models. In this study, we applied Calibrated Probability Estimation (CaPE) to segmentation tasks to evaluate its impact on model calibration. Our results indicate that while CaPE improves calibration, its effect is less pronounced compared to classification tasks, suggesting that segmentation models can inherently provide better probability estimates. We also investigated the influence of dataset size and bin optimization on the effectiveness of calibration. Our results emphasize the expressive power of segmentation models as probability estimators and incorporate probabilistic reasoning, which is crucial for applications requiring p
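To make the notion of calibration in the abstract above concrete, the following sketch computes a standard binned expected calibration error (ECE) over per-pixel foreground probabilities. It is not the CaPE method itself; the function name `expected_calibration_error`, the equal-width binning, and the default of 10 bins are assumptions made for illustration.

```python
import numpy as np


def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE for binary per-pixel probabilities (illustrative sketch).

    probs  : predicted foreground probabilities in [0, 1], any shape
    labels : binary ground-truth mask (0 or 1), same shape as probs
    """
    probs = np.asarray(probs, dtype=float).ravel()
    labels = np.asarray(labels, dtype=float).ravel()
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Digitizing against the interior edges yields bin indices 0..n_bins-1.
    bin_ids = np.digitize(probs, edges[1:-1])
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # Gap between mean predicted probability and observed frequency,
            # weighted by the fraction of pixels that fall into this bin.
            gap = abs(probs[mask].mean() - labels[mask].mean())
            ece += mask.mean() * gap
    return float(ece)
```

The number of bins is a free parameter here; the abstract's reference to bin optimization suggests that this choice can affect how well calibration is measured or enforced.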
Accurate precipitation forecasts are crucial for applications such as flood management, agricultural planning, water resource allocation, and weather warnings. Despite advances in numerical weather prediction (NWP) models, they still exhibit significant biases and uncertainties, especially at high spatial and temporal resolutions. To address these limitations, we explore uncertainty-aware deep learning models for post-processing daily cumulative quantitative precipitation forecasts to obtain forecast uncertainties that lead to a better trade-off between accuracy and reliability. Our study compares different state-of-the-art models, and we propose a variant of the well-known SDE-Net, called SDE U-Net, tailored to segmentation problems like ours. We evaluate its performance for both typical and intense precipitation events. Our results show that all deep learning models significantly outperform the average baseline NWP solution, with our implementation of the SDE U-Net showing the best trade-off between accuracy and reliability. Integrating these models, which account for uncertainty, into operational forecasting systems can improve decision-making and preparedness for weather-relate
We show how to derive an effective nonlinear dynamics, described by the Hartree-Fock equations, for fermionic quantum particles confined to a two-dimensional box and in the presence of an external, uniform magnetic field. The derivation invokes the Dirac-Frenkel principle. We discuss the validity of this effective description with respect to the many-body Schrödinger dynamics for small times and for weak interactions, and also with regard to the number of particles.
We define a $\mathbb{Z}_2$-valued topological and gauge invariant associated to any 1-dimensional, translation-invariant topological insulator which satisfies either particle-hole symmetry or chiral symmetry. The invariant can be computed from the Berry phase associated to a suitable basis of Bloch functions which is compatible with the symmetries. We compute the invariant in the Su-Schrieffer-Heeger model for chiral symmetric insulators, and in the Kitaev model for particle-hole symmetric insulators. We show that in both cases the $\mathbb{Z}_2$ invariant predicts the existence of zero-energy boundary states for the corresponding truncated models.
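The link between a Berry phase and a $\mathbb{Z}_2$ label in the abstract above can be illustrated numerically for the Su-Schrieffer-Heeger model. The sketch below computes the standard discretized Zak (Berry) phase of the lower band via a Wilson loop; it is not the specific symmetry-adapted construction of the paper. The Bloch Hamiltonian convention (intra-cell hopping `v`, inter-cell hopping `w`, periodic in $k$) and all function names are assumptions, and which phase is labeled nontrivial depends on that choice of unit cell.

```python
import numpy as np


def ssh_hamiltonian(k, v, w):
    """2x2 SSH Bloch Hamiltonian in a gauge periodic in k (assumed convention)."""
    h = v + w * np.exp(1j * k)
    return np.array([[0.0, np.conj(h)], [h, 0.0]])


def zak_phase(v, w, n_k=400):
    """Berry phase of the lower band from a discretized Wilson loop."""
    ks = 2.0 * np.pi * np.arange(n_k) / n_k
    # eigh sorts eigenvalues in ascending order, so column 0 is the lower band.
    states = [np.linalg.eigh(ssh_hamiltonian(k, v, w))[1][:, 0] for k in ks]
    loop = 1.0 + 0.0j
    for j in range(n_k):
        # Closing the loop with index 0 makes the product gauge invariant.
        loop *= np.vdot(states[j], states[(j + 1) % n_k])
    return -np.angle(loop)


def z2_invariant(v, w):
    """Zak phase in units of pi, reduced mod 2."""
    return int(round(abs(zak_phase(v, w)) / np.pi)) % 2


if __name__ == "__main__":
    print(z2_invariant(1.0, 0.5))  # v > w: 0
    print(z2_invariant(0.5, 1.0))  # w > v: 1
```

Because each eigenvector enters the loop once as a bra and once as a ket, the arbitrary phases returned by the eigensolver cancel, which is why no explicit gauge fixing is needed in this sketch.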
Euclid will survey most of the accessible extragalactic sky with imaging and slitless spectroscopy observations, creating a unique spectroscopic catalog of galaxies with H$\alpha$ line in emission that will map the Universe from $z=0.9$ to $1.8$. With low expected statistical errors, the error budget will likely be dominated by systematic errors related to uncertainties in the data and modelling. I will discuss the strategy that has been proposed to mitigate the expected systematic effects and propagate the uncertainty of mitigation to cosmological parameter error bars.
PINOCCHIO (PINpointing Orbit-Crossing Collapsed Hierarchical Objects) is a new algorithm for identifying dark matter halos in a given numerical realisation of the linear density field in a hierarchical universe (Monaco et al. 2001). It is shown that Lagrangian perturbation theory, and in particular its ellipsoidal truncation, is able to predict accurately the collapse, in the orbit-crossing sense, of generic mass elements. Some points that have undergone orbit crossing are assigned to the network of filaments and sheets that connects the halos; it is demonstrated that this network resembles closely that found in N-body simulations. The code generates a catalogue of dark matter halos with known mass, position, velocity, merging history and angular momentum. It is shown that the predictions of the code are very accurate when compared with the results of large N-body simulations that cover a range of cosmological models, box sizes and numerical resolutions. The mass function is recovered with an accuracy of better than 10 per cent in number density for halos with at least 30-50 particles. A similar accuracy is reached in the estimate of the correlation length r_0. The good agreement i
We study the interaction of the relaxation processes with the density fluctuations by molecular dynamics simulation of a flexible molecule model for o-terphenyl (oTP) in the liquid and supercooled phases. We find evidence, besides the structural relaxation, of a secondary vibrational relaxation whose characteristic time, a few ps, is slightly temperature dependent. This i) confirms the result by Monaco et al. [Phys. Rev. E 62, 7595 (2000)] of the vibrational nature of the fast relaxation observed in Brillouin Light Scattering (BLS) experiments in oTP; and ii) poses a caveat on the interpretation of the BLS spectra of molecular systems in terms of a purely center of mass dynamics.
We provide a constructive proof of exponentially localized Wannier functions and related Bloch frames in 1- and 2-dimensional time-reversal symmetric (TRS) topological insulators. The construction is formulated in terms of periodic TRS families of projectors (corresponding, in applications, to the eigenprojectors on an arbitrary number of relevant energy bands), and is thus model-independent. The possibility to enforce also a TRS constraint on the frame is investigated. This leads to a topological obstruction in dimension 2, related to $\mathbb{Z}_2$ topological phases. We review several proposals for $\mathbb{Z}_2$ indices that distinguish these topological phases, including the ones by Fu--Kane [Phys. Rev. B 74 (2006), 195312], Prodan [Phys. Rev. B 83 (2011), 235115], Graf--Porta [Commun. Math. Phys. 324 (2013), 851] and Fiorenza--Monaco--Panati [Commun. Math. Phys., in press]. We show that all these formulations are equivalent. In particular, this allows us to prove a geometric formula for the $\mathbb{Z}_2$ invariant of 2-dimensional TRS topological insulators, originally indicated in [Phys. Rev. B 74 (2006), 195312], which expresses it in terms of the Berry connection and the
We have carried out a detailed experimental investigation of the static properties of planar Josephson tunnel junctions in the presence of a uniform external magnetic field applied in an arbitrary orientation with respect to the barrier plane. We considered annular junctions, as well as rectangular junctions (having both overlap and cross-type geometries) with different barrier aspect ratios. It is shown how most of the experimental findings in an oblique field can be reproduced invoking the superposition principle to combine the classical behavior of electrically small junctions in an in-plane field together with the small junction behavior in a transverse field that we recently published [R. Monaco et al., J. Appl. Phys. vol 104, 023906 (2008)]. We explore the implications of these results in supposing systematic errors in previous experiments and in proposing new possible applications. We show that the presence of a transverse field may have important consequences, which could be either voluntarily exploited in applications or present an unwanted perturbation.
The shining of quasars is a likely trigger of massive galactic winds, able to remove most ISM from a star-forming spheroid. However, the mechanism responsible for the deposition of energy into the ISM is still unclear. Starting from a model for feedback in galaxy formation with a two-phase medium (Monaco 2004a), we propose that the perturbation induced by radiative heating from a quasar on the ISM triggers a critical change of feedback regime. In the feedback model, SNRs expanding in the hot and pressurized phase of a star-forming spheroid typically become pressure-confined before the hot interior gas is able to cool. Whenever the evaporation flow due to radiative heating of the quasar is significant with respect to the star-formation rate, the SNRs reach the point where their interior gas cools before being confined, forming a thick cold shell. We show that in these conditions the shells percolate into a super-shell of cold gas that sweeps the whole galaxy. Radiation pressure then pushes the shell out of the galaxy. This self-limiting mechanism leads to a correlation between black hole and bulge masses. The insertion of a motivated wind trigger criterion in a hierarchical galaxy for
We measured the magnetic field dependence of the critical current of high quality Nb-based planar Josephson tunnel junctions in the presence of a controllable non-uniform field distribution. We found skewed and slowly changing magnetic diffraction patterns quite dissimilar from the Fraunhofer-like ones typical of a homogeneous field. Our findings can be well interpreted in terms of recent theoretical predictions [R. Monaco, J. Appl. Phys. vol.108, 033906 (2010)] for a uniform magnetic field gradient leading to Fresnel-like magnetic diffraction patterns. We also show that Fiske resonances can be suppressed by an asymmetric magnetic field profile.
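For comparison with the skewed patterns described above, the textbook Fraunhofer-like pattern of an electrically small rectangular junction in a homogeneous in-plane field can be written as follows. This is the standard idealized expression, stated under the assumption of a uniform critical current density, and is not taken from the paper itself:

$$
I_c(\Phi) = I_c(0)\,\left|\frac{\sin(\pi\Phi/\Phi_0)}{\pi\Phi/\Phi_0}\right|, \qquad \Phi_0 = \frac{h}{2e},
$$

where $\Phi$ is the magnetic flux threading the barrier and $\Phi_0$ is the magnetic flux quantum. The pattern is symmetric in $\Phi$ and vanishes at integer multiples of $\Phi_0$, which is the behavior that a non-uniform field distribution is reported to break.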
Researchers discovered a way to reverse the direction of energy flow in turbulence, challenging a theory that has stood for more than 80 years. The finding could open new possibilities for controlling ocean currents, improving medical technologies, and enhancing climate forecasting
A stunning spiral galaxy called Messier 88 is racing through the crowded Virgo Cluster on a journey that will dramatically reshape its future. At its heart lies a supermassive black hole about 100 million times the mass of the Sun, while its graceful spiral arms sparkle with young star clusters and dark clouds of dust. But as M88 plunges deeper int
As traditional chip miniaturization slows, researchers have found a way to pack more computing power into the same space by stacking silicon circuits in multiple layers. The new process uses ultra-thin silicon membranes and low-temperature manufacturing techniques to overcome a major obstacle that has long blocked the production of true 3D chips
Astronomers have finally cracked the mystery behind a strange class of repeating cosmic signals that has baffled scientists for years. Using Australia’s ASKAP radio telescope, researchers traced the bursts to a rare stellar duo in which a dense white dwarf is relentlessly siphoning material from a nearby red dwarf companion. As the stolen matter sp