Reinforcement learning (RL) has emerged as a key paradigm for aligning and optimizing large language models (LLMs). Standard approaches treat the LLM as the policy and apply RL directly over the full vocabulary space. However, this formulation includes the massive tail of contextually irrelevant tokens in the action space, which could distract the policy from focusing on decision-making among the truly reasonable tokens. In this work, we verify that valid reasoning paths could inherently concentrate within a low-rank subspace. Based on this insight, we introduce Reinforcement Learning with Promising Tokens (RLPT), a framework that mitigates the action space issue by decoupling strategic decision-making from token generation. Specifically, RLPT leverages the semantic priors of the base model to identify a dynamic set of promising tokens and constrains policy optimization exclusively to this refined subset via masking. Theoretical analysis and empirical results demonstrate that RLPT effectively reduces gradient variance, stabilizes the training process, and improves sample efficiency. Experimental results on math, coding, and telecom reasoning show that RLPT outperforms standard RL bas
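The masking step described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the promising set is the top-k tokens under the base model's logits; the abstract does not state the selection rule, and the names `promising_token_set` and `masked_policy` are hypothetical.

```python
import math

def promising_token_set(base_logits, k):
    """Indices of the k highest-scoring tokens under the base model.
    Top-k is an assumed selection rule, not the one stated by the source."""
    ranked = sorted(range(len(base_logits)), key=lambda i: base_logits[i], reverse=True)
    return set(ranked[:k])

def masked_policy(policy_logits, promising):
    """Softmax over the policy logits restricted to the promising set.
    Tokens outside the set receive probability exactly 0."""
    top = max(policy_logits[i] for i in promising)  # subtract max for numerical stability
    exps = [math.exp(l - top) if i in promising else 0.0
            for i, l in enumerate(policy_logits)]
    total = sum(exps)
    return [e / total for e in exps]

# Toy vocabulary of five tokens.
base_logits = [2.0, 0.5, -3.0, 1.5, -6.0]    # base-model scores
policy_logits = [1.0, 0.2, 4.0, 0.8, 0.0]    # current policy scores
promising = promising_token_set(base_logits, k=3)
probs = masked_policy(policy_logits, promising)
```

In this toy case token 2 has the largest policy logit but is excluded because the base model ranks it low, so the policy distributes all probability mass over tokens 0, 1, and 3.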
Both digital economy and digital technology researchers increasingly recognize the need to better address the role that artificial intelligence (AI) plays in shaping the evolution of the environmental, social and governance aspects of development. It appears that sustainability and AI research converge on the features of wicked problems that are complex, interconnected and dynamic. Building on this convergence, this article aims to map out the necessary, challenging, and promising intersections by providing an overview of state-of-the-art research. Based on 541 bibliographic records collected from the Web of Science (WoS) database, the findings reveal the increasingly central body of work on green and sustainable science and technology in bridging various disciplines, main journals and key topics and concepts. The findings reveal how such interactions can be necessary, challenging, and promising. The article concludes with a few general arguments regarding how to diversify and expand the community of practice regarding AI for sustainable development, especially in the areas of expected AI application areas and institutions.
BaCd2P2 (BCP) has recently been identified as a new solar absorber with promising optoelectronic properties. This work demonstrates that, despite having a low precursor purity (98.90% to 99.95%), synthesized BCP samples exhibit a promising photoconductive carrier lifetime up to 300 ns, an implied open-circuit voltage exceeding 1 V, and photoluminescence quantum yield on the order of 0.2%, comparable to a high-purity single-crystalline GaAs wafer. To better understand the underlying mechanisms of BCP's promising properties, its tolerance to intrinsic defects and extrinsic impurities is investigated using first-principles defect modeling and compared with that of the well-studied GaAs. The results show that the nonradiative recombination rates induced by dominant deep-level intrinsic antisite defects are lower in BCP than in GaAs under typical growth conditions. Further exploration of the impact of transition metal impurities in the raw materials used to make BCP and impurities introduced during its synthesis shows that most of these do not form deep-level nonradiative recombination centers. As an impurity-tolerant counterpart of GaAs, BCP demonstrates great potential to improve the
The high configurability of modern software systems has made configuration tuning a crucial step for assuring system performance, e.g., latency or throughput. However, given the expensive measurements, large configuration space, and rugged configuration landscape, existing tuners are often ineffective because it is difficult to balance budget utilization between exploring uncertain regions (for escaping from local optima) and exploiting guidance of known good configurations (for fast convergence). The root cause is that we lack knowledge of where the promising regions lie, which also causes challenges in the explainability of the results. In this paper, we propose PromiseTune, which tunes configurations guided by causally purified rules. PromiseTune is unique in the sense that we learn rules, which reflect certain regions in the configuration landscape, and purify them with causal inference. The remaining rules serve as approximated reflections of the promising regions, bounding the tuning to emphasize these places in the landscape. This, as we demonstrate, can effectively mitigate the impact of the exploration and exploitation trade-off. Those purified regions can then be paired with t
To improve the ability of large language models (LLMs) to tackle complex reasoning problems, chain-of-thought (CoT) methods were proposed to guide LLMs to reason step-by-step, enabling problem solving from simple to complex. State-of-the-art methods for generating such a chain involve interactive collaboration, where the learner generates candidate intermediate thoughts, evaluated by the LLM, guiding the generation of subsequent thoughts. However, a widespread yet understudied problem is that the evaluation from the LLM is typically noisy and unreliable, potentially misleading the generation process in selecting promising intermediate thoughts. In this paper, motivated by Vapnik's principle, we use pairwise-comparison evaluation instead of point-wise scoring to search for promising intermediate thoughts with the noisy feedback from the LLM. In each round, we randomly pair intermediate thoughts and directly prompt the LLM to select the more promising one from each pair, allowing us to identify the most promising thoughts through an iterative process. To further alleviate the noise in the comparison, we incorporate techniques from ensemble learning and dueling bandits, proposing
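The round-based pairwise selection described above can be sketched as a knockout tournament with repeated comparisons. This is a minimal illustration under stated assumptions, not the authors' algorithm: `noisy_compare` is a hypothetical stand-in for prompting the LLM, the hidden `quality` scores exist only to simulate a judge, and majority voting is a simple substitute for the ensemble and dueling-bandit techniques the abstract mentions.

```python
import random

def noisy_compare(a, b, quality, rng, flip_prob):
    """Hypothetical LLM judge: picks the better thought, but errs with probability flip_prob."""
    better, worse = (a, b) if quality[a] >= quality[b] else (b, a)
    return worse if rng.random() < flip_prob else better

def majority_compare(a, b, quality, rng, flip_prob, votes=5):
    """Repeat the noisy comparison and keep the majority winner (votes should be odd)."""
    wins_a = sum(noisy_compare(a, b, quality, rng, flip_prob) == a for _ in range(votes))
    return a if 2 * wins_a > votes else b

def select_promising(thoughts, quality, rng, flip_prob=0.2):
    """Randomly pair the surviving thoughts each round and keep the winner of each pair."""
    pool = list(thoughts)
    while len(pool) > 1:
        rng.shuffle(pool)
        survivors = [majority_compare(pool[i], pool[i + 1], quality, rng, flip_prob)
                     for i in range(0, len(pool) - 1, 2)]
        if len(pool) % 2 == 1:
            survivors.append(pool[-1])  # an unpaired thought advances automatically
        pool = survivors
    return pool[0]

# Simulated candidate thoughts with hidden quality scores (illustrative values only).
quality = {"t1": 0.1, "t2": 0.4, "t3": 0.9, "t4": 0.3, "t5": 0.2}
winner = select_promising(list(quality), quality, random.Random(0))
```

With a noise-free judge (`flip_prob=0.0`) the best thought always survives; with noise, repeating each comparison reduces, but does not eliminate, the chance that a weaker thought wins.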
Despite the promising capability of multimodal foundation models, their application to the generation of meteorological products and services remains nascent. To accelerate aspiration and adoption, we explore the novel use of a vision language model for writing the iconic Shipping Forecast text directly from video-encoded gridded weather data. These early results demonstrate promising scalable technological opportunities for enhancing production efficiency and service innovation within the weather enterprise and beyond.
Accurate and verifiable large language model (LLM) simulations of human research subjects promise an accessible data source for understanding human behavior and training new AI systems. However, results to date have been limited, and few social scientists have adopted this method. In this position paper, we argue that the promise of LLM social simulations can be achieved by addressing five tractable challenges. We ground our argument in a review of empirical comparisons between LLMs and human research subjects, commentaries on the topic, and related work. We identify promising directions, including context-rich prompting and fine-tuning with social science datasets. We believe that LLM social simulations can already be used for pilot and exploratory studies, and more widespread use may soon be possible with rapidly advancing LLM capabilities. Researchers should prioritize developing conceptual models and iterative evaluations to make the best use of new AI systems.
Low-Earth orbit (LEO) satellite systems have been deemed a promising key enabler for current 5G and the forthcoming 6G wireless networks. Such LEO satellite constellations can provide worldwide three-dimensional coverage, high data rate, and scalability, thus enabling truly ubiquitous connectivity. On the other hand, another promising technology, reconfigurable intelligent surfaces (RISs), has emerged with favorable features, such as flexible deployment, cost & power efficiency, less transmission delay, noise-free nature, and in-band full-duplex structure. LEO satellite networks have many practical imperfections and limitations; however, exploiting RISs has been shown to be a potential solution to overcome these challenges. Particularly, RISs can enhance link quality, reduce the Doppler shift effect, and mitigate inter-/intra beam interference. In this article, we delve into exploiting RISs in LEO satellite networks. First, we present a holistic overview of LEO satellite communication and RIS technology, highlighting potential benefits and challenges. Second, we describe promising usage scenarios and applications in detail. Finally, we discuss potential future directions and ch
Sampling-based path planning algorithms usually implement uniform sampling methods to search the state space. However, uniform sampling may lead to unnecessary exploration in many scenarios, such as the environment with a few dead ends. Our previous work proposes to use the promising region to guide the sampling process to address the issue. However, the predicted promising regions are often disconnected, which means they cannot connect the start and goal state, resulting in a lack of probabilistic completeness. This work focuses on enhancing the connectivity of predicted promising regions. Our proposed method regresses the connectivity probability of the edges in the x and y directions. In addition, it calculates the weight of the promising edges in loss to guide the neural network to pay more attention to the connectivity of the promising regions. We conduct a series of simulation experiments, and the results show that the connectivity of promising regions improves significantly. Furthermore, we analyze the effect of connectivity on sampling-based path planning algorithms and conclude that connectivity plays an essential role in maintaining algorithm performance.
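To make the connectivity issue concrete, the following minimal sketch checks whether a predicted promising region links the start and goal cells on a grid. It only illustrates what "disconnected" means here; it is not the network or loss proposed in the abstract, and the boolean grid representation, 4-connectivity, and the function name `region_connects` are assumptions made for the example.

```python
from collections import deque

def region_connects(grid, start, goal):
    """Breadth-first search over promising cells (truthy entries), 4-connected.
    Returns True if goal is reachable from start without leaving the region."""
    rows, cols = len(grid), len(grid[0])
    if not (grid[start[0]][start[1]] and grid[goal[0]][goal[1]]):
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return True
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] and (nr, nc) not in seen:
                seen.add((nr, nc))
                queue.append((nr, nc))
    return False

# 1 marks a cell predicted as promising, 0 marks everything else.
connected = [[1, 1, 0],
             [0, 1, 0],
             [0, 1, 1]]
broken = [[1, 1, 0],
          [0, 0, 0],   # the gap in this row separates start from goal
          [0, 1, 1]]
ok = region_connects(connected, (0, 0), (2, 2))
gap = region_connects(broken, (0, 0), (2, 2))
```

A sampler restricted to the `broken` region could never join start and goal, which is the loss of probabilistic completeness the abstract refers to.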
Forthcoming missions probing the absolute intensity of the CMB are expected to be able to measure spectral distortions, which are deviations from its blackbody distribution. As cosmic inflation can induce spectral distortions, these experiments offer a possibility to further test the various promising inflationary proposals, whose predictions need to be carefully determined. After numerically fitting all inflationary observables to match current observations, we compute the predicted spectral distortions of various promising single and multifield inflationary models. The predictions of single-field inflationary models display deviations between 0.5% and 20% with respect to the standard cosmological model in the observable window, where multi-natural and axion-monodromy inflation stand out in this respect. In the case of multifield inflation, we observe a richer structure of the power spectrum, which, in the case of so-called hybrid attractors, yields spectral distortions about 100 times more intense than the standard signal. These observations open up questions about the relation among our results and other cosmological observables that are also to be probed soon, such as the produ
The promising field of organic electronics has ushered in a new era of biosensing technology, offering a promising frontier for applications in both medical diagnostics and environmental monitoring. This review paper provides a comprehensive overview of the remarkable progress and potential of organic electronics in biosensing applications. It explores the multifaceted aspects of organic materials and devices, highlighting their unique advantages, such as flexibility, biocompatibility, and low-cost fabrication. The paper delves into the diverse range of biosensors enabled by organic electronics, including electrochemical, optical, piezoelectric, and thermo sensors, showcasing their versatility in detecting biomolecules, pathogens, and environmental pollutants. Furthermore, integrating organic biosensors into wearable devices and the Internet of Things (IoT) ecosystem is discussed, offering real-time, remote, and personalized monitoring solutions. The review also addresses the current challenges and prospects of organic biosensing, emphasizing the potential for breakthroughs in personalized medicine, environmental sustainability, and the advancement of human health and well-being.
In the quest for efficient and cost-effective photovoltaic absorber materials beyond silicon, considerable attention has been directed toward exploring alternatives. One such material, zincblende-derived Cu2ZnSnS4 (CZTS), has shown promise due to its ideal band-gap size and high absorption coefficient. However, challenges such as structural defects and secondary phase formation have hindered its development. In this study, we examine the potential of another compound, Cu2ZnSnO4 (CZTO), which has a composition similar to CZTS, as a promising alternative. Employing ab initio density functional theory (DFT) calculations in combination with an evolutionary structure prediction algorithm, we identify that the crystalline phase of the delafossite structure is the most stable among the 900 (meta)stable CZTO. Its thermodynamic stability at room temperature is also confirmed by the molecular dynamics study. Excitingly, this new phase of CZTO displays a direct band gap where the dipole-allowed transition occurs, making it a strong candidate for efficient light absorption. Furthermore, the estimation of spectroscopic limited maximum efficiency (SLME) directly demonstrates the high potential of delafoss
Unmanned aerial vehicle (UAV) swarm-enabled edge computing is envisioned to be promising in sixth-generation wireless communication networks due to its wide range of application scenarios and flexible deployment. However, most existing works focus on edge computing enabled by a single UAV or a small group of UAVs, which is very different from UAV swarm-enabled edge computing. To facilitate the practical application of UAV swarm-enabled edge computing, state-of-the-art research is presented in this article. The potential applications, architectures and implementation considerations are illustrated. Moreover, the promising enabling technologies for UAV swarm-enabled edge computing are discussed. Furthermore, we outline challenges and open issues in order to shed light on future research directions.
Consensus is unnecessary when the truth is available. In this paper, we present a new perspective on rebuilding the blockchain without consensus. When the consensus phase is eliminated from a blockchain, transactions can be canonized quickly using a well-defined universal rule without consuming hashing power. Thus, the transactions per second (TPS) metric of such a consensusless blockchain can be greatly boosted. Although the consensusless blockchain is promising, several technical challenges must be addressed. For example, double-spending attacks and frequent forking events must be prevented, and the credit for minting a block must be carefully defined. To address those technical challenges, we propose several solutions for our consensusless blockchain (CB), including a naive monotonic scoring mechanism to calculate the ranking of each block in the chain, and a two-stage witness mechanism to add new blocks. The proposed CB chain promises to offer a simplified and equipment-cheap infrastructure for rich real-world decentralized applications.
Deoxyribonucleic acid (DNA), with its high density and long durability, is a promising storage medium for long-term archival storage and has attracted much attention. Several studies have verified the feasibility of using DNA for archival storage with a small amount of data. However, the achievable storage capacity of DNA as archival storage has not yet been comprehensively investigated. Theoretically, the DNA storage density is about 1 exabyte/mm³ (10⁹ GB/mm³). However, according to our investigation, the capacity of a DNA storage tube based on current synthesizing and sequencing technologies is only hundreds of gigabytes because of multiple biological and technological constraints. This paper identifies and investigates the critical factors affecting the single DNA tube capacity for archival storage. Finally, we suggest several promising directions to overcome the limitations and enhance DNA storage capacity.
The high-throughput (HT) computational method is a useful tool to screen high performance functional materials. In this work, using the deformation potential method under the single band model, we evaluate the carrier relaxation time and establish an electrical descriptor (χ) characterized by the carrier effective masses based on the simple rigid band approximation. The descriptor (χ) can be used to reasonably represent the maximum power factor without solving the electron Boltzmann transport equation. Additionally, the Grüneisen parameter (γ), a descriptor of the lattice anharmonicity and lattice thermal conductivity, is efficiently evaluated using the elastic properties, omitting the costly phonon calculations. Applying two descriptors (χ and γ) to binary chalcogenides, we HT compute 243 semiconductors and screen 50 promising thermoelectric materials. For these theoretically determined compounds, we successfully predict some previously experimentally and theoretically investigated promising thermoelectric materials. Additionally, 9 p-type and 14 n-type previously unreported binary chalcogenides are also predicted as promising thermoelectric materials. Our work prov
Singlet fission is a form of multiple exciton generation which occurs in organic chromophores when a high energy singlet exciton separates into two lower energy triplet excitons, each with approximately half the singlet energy. Since this process is spin-allowed it can proceed on an ultrafast timescale of less than several picoseconds, outcompeting most other loss mechanisms and reaching quantitative yields approaching 200%. Due to this high quantum efficiency, the singlet fission process shows promise as a means of reducing thermalisation losses in photovoltaic cells. This would potentially allow for efficiency improvements beyond the thermodynamic limit in a single junction cell. Efforts to incorporate this process into solar photovoltaic cells have spanned a wide range of device structures over the past decade. In this review we compare and categorise these attempts in order to assess the state of the field and identify the most promising avenues of future research and development.
Massive multiple-input multiple-output (MIMO) is a promising technology for enabling cellular-connected unmanned aerial vehicle (UAV) communications in the future. Equipped with full-dimensional large arrays, ground base stations (GBSs) can apply adaptive fine-grained three-dimensional (3D) beamforming to mitigate the strong interference between high-altitude UAVs and low-altitude terrestrial users, thus significantly enhancing the network spectral efficiency. However, the performance gain of massive MIMO critically depends on accurate channel state information (CSI) of both UAVs and terrestrial users at the GBSs, which is practically difficult to achieve due to UAV-induced pilot contamination and the UAVs' high mobility in 3D. Moreover, the increasingly popular applications relying on a large group of coordinated UAVs or a UAV swarm, as well as the practical hybrid GBS beamforming architecture for massive MIMO, further complicate the pilot contamination and channel/beam tracking problems. In this article, we provide an overview of the above challenging issues, propose new solutions to cope with them, and discuss promising directions for future research. Preliminary simulation re
Large language models are increasingly deployed as autonomous agents in multi-agent settings where they communicate intentions and take consequential actions with limited human oversight. A critical safety question is whether agents that publicly commit to actions break those promises when they can privately deviate, and what the consequences are for both themselves and the collective. We study deception as a deviation from a publicly announced action in one-shot normal-form games, classifying each deviation by its effect on individual payoff and collective welfare into four categories: win-win, selfish, altruistic, and sabotaging. By exhaustively enumerating announcement profiles across six canonical games, nine frontier models, and varying group sizes, we identify all opportunities for each deviation type and measure how often agents exploit them. Across all settings, agents deviate from promises in approximately 56.6% of scenarios, but the character of deception varies substantially across models even at similar overall rates. Most critically, for the majority of the models, promise-breaking occurs without verbalized awareness of the fact that they are breaking promises.
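The four-way classification above can be sketched for a single deviation as follows. The mapping of signs to labels is an assumption inferred from the category names, as are measuring collective welfare as the sum of payoffs and treating ties as non-losses; the payoff table is a textbook prisoner's dilemma chosen for illustration, not data from the study.

```python
def classify_deviation(payoffs, announced, actual, agent):
    """Label one agent's deviation from the announced action profile.
    payoffs maps an action profile (tuple of actions) to a tuple of per-agent payoffs.
    Assumptions: welfare is the sum of payoffs; a change of zero counts as a non-loss."""
    before, after = payoffs[announced], payoffs[actual]
    d_self = after[agent] - before[agent]
    d_welfare = sum(after) - sum(before)
    if d_self >= 0 and d_welfare >= 0:
        return "win-win"
    if d_self >= 0:
        return "selfish"       # agent gains, collective loses
    if d_welfare >= 0:
        return "altruistic"    # agent loses, collective gains
    return "sabotaging"        # both lose

# Illustrative two-player prisoner's dilemma: C = cooperate, D = defect.
pd = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
# Agent 0 announces C but plays D while agent 1 keeps its promise.
label_defect = classify_deviation(pd, ("C", "C"), ("D", "C"), agent=0)
# Agent 0 announces D but plays C while agent 1 defects as announced.
label_cooperate = classify_deviation(pd, ("D", "D"), ("C", "D"), agent=0)
```

In the first case agent 0's payoff rises from 3 to 5 while total welfare falls from 6 to 5, so the deviation is labeled selfish; in the second its payoff falls from 1 to 0 while welfare rises from 2 to 5, so it is labeled altruistic.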
A Boolean predicate $A$ is defined to be promise-useful if $\operatorname{PCSP}(A,B)$ is tractable for some non-trivial $B$ and otherwise it is promise-useless. We initiate investigations of this notion and derive sufficient conditions for both promise-usefulness and promise-uselessness (assuming $\text{P} \ne \text{NP}$). While we do not obtain a complete characterization, our conditions are sufficient to classify all predicates of arity at most $4$ and almost all predicates of arity $5$. We also derive asymptotic results to show that for large arities a vast majority of all predicates are promise-useless. Our results are primarily obtained by a thorough study of the "Promise-SAT" problem, in which we are given a $k$-SAT instance with the promise that there is a satisfying assignment for which the literal values of each clause satisfy some additional constraint. The algorithmic results are based on the basic LP + affine IP algorithm of Brakensiek et al. (SICOMP, 2020) while we use a number of novel criteria to establish NP-hardness.