In financial predictions, the performance of machine learning models is often assessed by Rank IC, which is the Spearman rank correlation between the model predictions and the realized asset returns. Despite its wide adoption, most existing models are trained using regression losses or ranking objectives that may not align with Rank IC. We propose LambdaRankIC, a novel learning-to-rank approach that directly optimizes Rank IC. We circumvent the non-differentiability of the ranking operator by deriving the closed-form expression for the lambda gradients induced by the pairwise rank swaps, which enables efficient gradient-based optimization within the LambdaRank framework. We implement LambdaRankIC as a custom objective in XGBoost. Theoretically, we show that our approach optimizes an upper bound on Rank IC. We evaluate the proposed approach on both simulated and real-world financial data. In simulation studies, LambdaRankIC accurately recovers the true ranking structure in noiseless settings and consistently outperforms regression-based and NDCG-oriented ranking methods under low signal-to-noise ratios and heavy-tailed noise regimes. In empirical experiments using real market data,
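As a minimal illustration of the evaluation metric named above, the sketch below computes Rank IC as the Spearman rank correlation between predictions and realized returns. The function names `average_ranks` and `rank_ic` and the toy numbers are assumptions made for illustration; this is not the LambdaRankIC objective itself, only the quantity it targets.

```python
import numpy as np

def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks inside each group of ties by the group mean.
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks

def rank_ic(predictions, realized_returns):
    """Spearman rank correlation = Pearson correlation of the rank vectors.

    Returns nan if either input is constant (zero rank variance).
    """
    rp = average_ranks(predictions)
    rr = average_ranks(realized_returns)
    rp = rp - rp.mean()
    rr = rr - rr.mean()
    return float(rp @ rr / np.sqrt((rp @ rp) * (rr @ rr)))

# Hypothetical cross-section of four assets (illustrative numbers only).
preds = [0.3, -0.1, 0.2, 0.0]
rets = [0.02, -0.01, 0.01, 0.03]
print(rank_ic(preds, rets))
```

Because only ranks enter the computation, any strictly increasing transformation of the predictions leaves the value unchanged, which is why regression losses on raw returns need not align with it.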
Reinforcement learning (RL) with delays is challenging as sensory perceptions lag behind the actual events: the RL agent needs to estimate the real state of its environment based on past observations. State-of-the-art (SOTA) methods typically employ recursive, step-by-step forecasting of states. This can cause the accumulation of compounding errors. To tackle this problem, our novel belief estimation method, named Directly Forecasting Belief Transformer (DFBT), directly forecasts states from observations without incrementally estimating intermediate states step-by-step. We theoretically demonstrate that DFBT greatly reduces compounding errors of existing recursively forecasting methods, yielding stronger performance guarantees. In experiments with D4RL offline datasets, DFBT reduces compounding errors with remarkable prediction accuracy. DFBT's capability to forecast state sequences also facilitates multi-step bootstrapping, thus greatly improving learning efficiency. On the MuJoCo benchmark, our DFBT-based method substantially outperforms SOTA baselines. Code is available at https://github.com/QingyuanWuNothing/DFBT.
Neuromorphic event cameras possess superior temporal resolution, power efficiency, and dynamic range compared to traditional cameras. However, their asynchronous and sparse data format poses a significant challenge for conventional deep learning methods. Most existing methods either densify events into frames, sacrificing their sparse asynchronous nature, or use irregular models that are less compatible with GPU acceleration. Inspired by word-to-vector models, we propose event2vec, a novel representation that allows Transformers to process events directly. We demonstrate the effectiveness of event2vec on the DVS Gesture, ASL-DVS, and DVS-Lip benchmarks, showing that event2vec is remarkably parameter-efficient, features high throughput and low latency, and achieves high accuracy even with an extremely low number of events or low spatial resolutions. These results show that sparse asynchronous event data can be directly integrated into high-throughput Transformer architectures, offering an efficient paradigm for real-time neuromorphic vision. The code is provided at https://github.com/Intelligent-Computing-Lab-Panda/event2vec.
Zero-shot reinforcement learning (RL) methods aim at instantly producing a behavior for an RL task in a given environment, from a description of the reward function. These methods are usually tested by evaluating their average performance on a series of downstream tasks. Yet they cannot be trained directly for that objective, unless the distribution of downstream tasks is known. Existing approaches either use other learning criteria [BBQ+18, TRO23, TO21, HDB+19], or explicitly set a prior on downstream tasks, such as reward functions given by a random neural network [FPAL24]. Here we prove that the zero-shot RL loss can be optimized directly, for a range of non-informative priors such as white noise rewards, temporally smooth rewards, ``scattered'' sparse rewards, or a combination of those. Thus, it is possible to learn the optimal zero-shot features algorithmically, for a wide mixture of priors. Surprisingly, the white noise prior leads to an objective almost identical to the one in VISR [HDB+19], via a different approach. This shows that some seemingly arbitrary choices in VISR, such as Von Mises--Fisher distributions, do maximize downstream performance. This also suggests more
2MASS 1207 b, the first directly imaged planetary-mass companion, has been instrumental in advancing our understanding of exoplanets and brown dwarfs over the past 20 years. We have performed extensive atmospheric retrieval analyses of 2MASS 1207 b's JWST/NIRSpec spectrum using petitRADTRANS and a new atmospheric inhomogeneity framework, which characterizes homogeneous atmospheres, patchy clouds, cloud-free hot spots, or the combination of patchy clouds and spots. Among 24 retrieval runs with various assumptions, the most statistically preferred model corresponds to the patchy cloud scheme, with Teff$=1174^{+4}_{-3}$ K, log(g)=$3.62^{+0.03}_{-0.02}$ dex, and R$=1.399^{+0.008}_{-0.010}$ R$_{\rm Jup}$, along with near-solar atmospheric compositions of [M/H]$=-0.05\pm0.03$ dex and C/O$=0.440\pm0.012$. This model suggests ~9% of 2MASS 1207 b's atmosphere is covered by thin iron clouds, producing L-dwarf-like spectra, while the remaining 91% consists of thick silicate and iron clouds, emitting blackbody-like spectra. These thin-cloud and thick-cloud regions resemble Jupiter's belts and zones, respectively; this scenario is consistently supported by other retrieval runs incorporating inh
Bohmian mechanics, also referred to as the de Broglie-Bohm pilot-wave theory, represents a deterministic and nonlocal interpretation of quantum mechanics. Since its origination in 1927, and despite many attempts, reconciling it with relativistic theory and verifying its relativistic effects have remained elusive. Here, we report a direct observation of relativistic characteristics of Bohmian mechanics. We reconstruct the relativistic Bohmian trajectories of single photons utilizing weak measurement techniques in a double-slit interferometer, unveiling a fundamental aspect of relativistic Bohmian mechanics. We investigate the effective squared mass density of single photons, revealing its negative values in the destructive regions -- a phenomenon directly linked to the tachyonic behavior in relativistic Bohmian mechanics. The continuity equations given by both the Klein-Gordon equation and Schrödinger's equation are experimentally examined. Our results indicate that within the framework of relativity, the conservation of energy holds true, whereas the conservation of particle number for a free scalar field no longer holds. The emergence of previously unobserved phenomena in the ext
Continuing recent studies of both the hereditary and super properties of certain classes of Abelian groups, we explore in depth the situation in the quite large class of directly finite Abelian groups. Seeking to connect some of these classes, we prove the surprising criteria that a relatively Hopfian group is hereditarily so only when it is extended Bassian, and likewise that a relatively Hopfian group is super only when it is extended Bassian. In this direction, additional relevant necessary and sufficient conditions are also proved in a slightly more general context.
Synthesizability in generative molecular design remains a pressing challenge. Existing methods to assess synthesizability span heuristics-based methods, retrosynthesis models, and synthesizability-constrained molecular generation. The latter has become increasingly prevalent and proceeds by defining a set of permitted actions a model can take when generating molecules, such that all generations are anchored in "synthetically-feasible" chemical transformations. To date, retrosynthesis models have been mostly used as a post-hoc filtering tool as their inference cost remains prohibitive to use directly in an optimization loop. In this work, we show that with a sufficiently sample-efficient generative model, it is straightforward to directly optimize for synthesizability using retrosynthesis models in goal-directed generation. Under a heavily-constrained computational budget, our model can generate molecules satisfying a multi-parameter drug discovery optimization task while being synthesizable, as deemed by the retrosynthesis model.
Spiking neural networks (SNNs) are brain-inspired energy-efficient models that encode information in spatiotemporal dynamics. Recently, directly trained deep SNNs have shown great success in achieving high performance on classification tasks with very few time steps. However, how to design a directly-trained SNN for the regression task of object detection remains a challenging problem. To address this problem, we propose EMS-YOLO, a novel directly-trained SNN framework for object detection, which is the first attempt to train a deep SNN with surrogate gradients for object detection rather than relying on ANN-SNN conversion strategies. Specifically, we design a full-spike residual block, EMS-ResNet, which can effectively extend the depth of the directly-trained SNN with low power consumption. Furthermore, we theoretically analyze and prove that EMS-ResNet can avoid vanishing or exploding gradients. The results demonstrate that our approach outperforms the state-of-the-art ANN-SNN conversion methods (at least 500 time steps) with far fewer time steps (only 4 time steps). It is shown that our model could achieve comparable performance to the ANN with the same architecture while consuming
In this paper, we present the Directly Denoising Diffusion Model (DDDM): a simple and generic approach for generating realistic images with few-step sampling, while multistep sampling is still preserved for better performance. DDDMs require neither delicately designed samplers nor distillation on pre-trained distillation models. DDDMs train the diffusion model conditioned on an estimated target that was generated by the model itself in previous training iterations. To generate images, samples generated from the previous time step are also taken into consideration, guiding the generation process iteratively. We further propose Pseudo-LPIPS, a novel metric loss that is more robust to the choice of hyperparameter values. Despite its simplicity, the proposed approach achieves strong performance on benchmark datasets. Our model achieves FID scores of 2.57 and 2.33 on CIFAR-10 in one-step and two-step sampling respectively, surpassing those obtained from GANs and distillation-based models. By extending the sampling to 1000 steps, we further reduce the FID score to 1.79, aligning with state-of-the-art methods in the literature. For ImageNet 64x64, our approach stands as a competitive contender against le
Entanglement plays a fundamental role in quantum physics and information processing. Here, we develop an unbiased estimator for mixed-state entanglement in the few-shot scenario and directly estimate it using random unitary evolution in a photonic system. As a supplement to traditional projective measurements, we incorporate Bell measurements on qubit-pairs, enriching the previous randomized measurement scheme, which cannot accomplish this task with only local unitary evolution. The scheme is scalable to n qubits via Bell measurements on qubit-pairs. The estimator can be derived directly from a few consecutive outcomes while exhibiting greater robustness to system errors and noise compared to schemes based on shadow estimation. We find that, under a fixed measurement resource, using more varied measurement settings with fewer repeats per setting is more efficient. Our protocol and demonstration advance the direct characterization of quantum states in practice.
Prioritized Experience Replay (PER) enables the model to learn more from relatively important samples by artificially changing how frequently they are accessed. However, this non-uniform sampling shifts the state-action distribution originally used to estimate Q-value functions, which introduces estimation deviation. In this article, a novel off-policy reinforcement learning training framework called Directly Attention Loss Adjusted Prioritized Experience Replay (DALAP) is proposed, which directly quantifies the extent of the distribution shift through a Parallel Self-Attention network, so as to accurately compensate for the error. In addition, a Priority-Encouragement mechanism is designed to optimize the sample screening criterion and further improve training efficiency. To verify the effectiveness and generality of DALAP, we integrate it with value-function-based, policy-gradient-based, and multi-agent reinforcement learning algorithms, respectively. Multiple groups of comparative experiments show that DALAP offers significant advantages in both improving the convergence rate and reducing the training variance.
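To make the distribution shift described above concrete, the following sketch shows the standard proportional form of PER with importance-sampling weights. It is not DALAP's attention-based compensation, which the abstract does not specify in detail. The function names and the default values of `alpha`, `beta`, and `eps` are illustrative assumptions.

```python
import numpy as np

def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Sampling probabilities proportional to (|TD error| + eps) ** alpha.

    alpha = 0 recovers uniform sampling; larger alpha skews sampling
    toward transitions with large TD error.
    """
    priorities = (np.abs(np.asarray(td_errors, dtype=float)) + eps) ** alpha
    return priorities / priorities.sum()

def importance_weights(probs, beta=0.4):
    """Weights (N * P(i)) ** (-beta), scaled so the largest weight is 1.

    beta = 1 fully corrects the bias introduced by non-uniform sampling.
    """
    probs = np.asarray(probs, dtype=float)
    weights = (len(probs) * probs) ** (-beta)
    return weights / weights.max()

# Hypothetical TD errors for a replay buffer of five transitions.
td = np.array([0.1, 2.0, 0.5, 0.05, 1.0])
p = per_probabilities(td)
w = importance_weights(p, beta=1.0)

rng = np.random.default_rng(0)
batch = rng.choice(len(td), size=3, p=p)  # prioritized minibatch indices
print(p, w, batch)
```

With `beta=1.0` the product `w * p` is constant across transitions, so the weighted loss matches the uniform-sampling expectation up to scale; with smaller `beta` the shift is only partially compensated.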
The path tracing method generates incoherent rays by randomly sampling directions. This randomness makes it unsuitable for modern processor architectures that rely on coherence to achieve optimal performance. Many efforts have been made to address this issue by reordering rays based on their origin, end, or direction to enhance coherence. However, a drawback of reordering methods is the need to encode and sort rays before tracing, introducing additional overhead. We propose a technique to generate coherent rays directly by reusing the direction. Additionally, we introduce an interleaved reuse domain partition method to mitigate the impact of sampling correlation resulting from direction reuse. We demonstrate the effectiveness of our approach across various scenes, establishing its superiority over reordering methods.
We quantify the mechanisms for manganese (Mn) diffusion through graphene in Mn/graphene/Ge (001) and Mn/graphene/GaAs (001) heterostructures for samples prepared by graphene layer transfer versus graphene growth directly on the semiconductor substrate. These heterostructures are important for applications in spintronics; however, challenges in synthesizing graphene directly on technologically important substrates such as GaAs necessitate layer transfer and anneal steps, which introduce defects into the graphene. \textit{In-situ} photoemission spectroscopy measurements reveal that Mn diffusion through graphene grown directly on a Ge (001) substrate is 1000 times lower than Mn diffusion into samples without graphene ($D_{gr,direct} \sim 4\times10^{-18}$ cm$^2$/s, $D_{no-gr} \sim 5 \times 10^{-15}$ cm$^2$/s at 500$^\circ$C). Transferred graphene on Ge suppresses Mn diffusion into Ge by a factor of 10 compared to no graphene ($D_{gr,transfer} \sim 4\times10^{-16}$ cm$^2$/s). For both transferred and directly-grown graphene, the low activation energy ($E_a \sim 0.1-0.5$ eV) suggests that Mn diffusion through graphene occurs primarily at graphene defects. This is further confirmed as the d
The recent experimental advances in capacitively coupled singlet-triplet qubits, particularly the demonstration of entanglement, open the question of what type of entangling gates the system's Hamiltonian can produce directly via a single square pulse. We address this question by considering the system's Hamiltonian from first principles and using the representation of its nonlocal properties in terms of local invariants. In the analysis we include the three different ways in which the system can be biased and their effect on the generation of entangling gates. We find that, in one of the possible biasing modes, the Hamiltonian has an especially simple form, which can directly generate a wide range of different entangling gates including the iSWAP gate. Moreover, using the complete form of the Hamiltonian we find that, for any biasing mode, a CNOT gate can be generated directly.
The majority of astronomers and physicists accept the reality of dark energy and also believe that it can only be studied indirectly through observation of the motions of stars and galaxies. In this paper I open the experimental question of whether it is possible to directly detect dark energy through the presence of dark energy density. Two thirds of this paper outlines the major aspects of dark energy density as now comprehended by the astronomical and physics community. The final third summarizes various proposals for direct detection of dark energy density or its possible effects. At this time I do not have a fruitful answer to the question: Can the Existence of Dark Energy Be Directly Detected?
Recently, Teachey, Kipping, and Schmitt (2018) reported the detection of a candidate exomoon, tentatively designated Kepler-1625b I, around a giant planet in the Kepler field. The candidate exomoon would be about the size and mass of Neptune, considerably larger than any moon in our Solar System, and if confirmed, would be the first in a new class of giant moons or binary planets. Motivated by the large mass ratio in the Kepler-1625b planet and satellite system, we investigate the detectability of similarly massive exomoons around directly imaged exoplanets via Doppler spectroscopy. The candidate moon around Kepler-1625b would induce a radial velocity signal of about 200 m/s on its host planet, large enough that similar moons around directly imaged planets orbiting bright, nearby stars might be detected with current or next generation instrumentation. In addition to searching for exomoons, a radial velocity survey of directly imaged planets could reveal the orientations of the planets' spin axes, making it possible to identify Uranus analogs.
In scientific computing, it is time-consuming to calculate an inverse operator ${\mathscr A}^{-1}$ of a differential equation ${\mathscr A}\varphi = f$, especially when ${\mathscr A}$ is a highly nonlinear operator. In this paper, based on the homotopy analysis method (HAM), a new approach, namely the method of directly defining inverse mapping (MDDiM), is proposed to gain analytic approximations of nonlinear differential equations. In other words, one can solve a nonlinear differential equation ${\mathscr A}\varphi = f$ by means of directly defining an inverse mapping $\mathscr J$, i.e. without calculating any inverse operators. Here, the inverse mapping $\mathscr J$ need not even be expressed explicitly in a differential form, since "mapping" is a more general concept than "differential operator". Some rules are provided to guide the direct definition of an inverse mapping $\mathscr J$. Besides, a convergence theorem is proved, which guarantees that a convergent series solution given by the MDDiM must be a solution of the problem under consideration. In addition, three nonlinear differential equations are used to illustrate the validity and potential of the MDDiM, and especi
An algebra $A$ is said to be directly finite if each left invertible element in the (conditional) unitization of $A$ is right invertible. We show that the reduced group ${\rm C}^\ast$-algebra of a unimodular group is directly finite, extending known results for the discrete case. We also investigate the corresponding problem for algebras of $p$-pseudofunctions, showing that these algebras are directly finite if $G$ is amenable and unimodular, or unimodular with the Kunze--Stein property. An exposition is also given of how existing results from the literature imply that $L^1(G)$ is not directly finite when $G$ is the affine group of either the real or complex line.
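For reference, the definition above can be restated in symbols. The notation $A^{\#}$ for the (conditional) unitization and the shift example are additions for illustration and are not taken from the abstract: $A$ is directly finite precisely when
\[
  \forall\, a, b \in A^{\#}: \qquad ab = 1 \;\Longrightarrow\; ba = 1 .
\]
A standard example of failure is the algebra $B(\ell^2(\mathbb{N}))$ of bounded operators: the unilateral shift $S$, defined by $S e_n = e_{n+1}$, satisfies
\[
  S^{*} S = I, \qquad S S^{*} = I - P_{e_1} \neq I ,
\]
where $P_{e_1}$ denotes the orthogonal projection onto the span of the first basis vector, so $S$ is left invertible but not right invertible.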
Variational data assimilation and deep learning share many algorithmic aspects. While the former focuses on system state estimation, the latter provides strong inductive biases for learning complex relationships. Here we design a hybrid architecture that learns the assimilation task directly from partial and noisy observations, using the mechanistic constraint of the 4DVAR algorithm. Finally, we show in an experiment that the proposed method is able to learn the desired inversion with interesting regularizing properties and that it is also computationally advantageous.