Found 20 results
We show that a rational agent with true and refinable knowledge of events cannot know if she knows everything or not. This epistemic limitation is not resolved by introspection about tautologies or by learning about new events.
A goal-conditioned reinforcement learning agent exploring an environment will see a wealth of information throughout a trajectory, most of which is discarded when only performing on-policy updates with respect to the commanded goal. All-goals learning, where each transition is used for learning off-policy with respect to every goal, allows agents to extract maximal information; however, it is usually computationally infeasible when done via naive relabelling. This can be overcome by jointly outputting values and actions for every goal at once, allowing for efficient, parallel all-goals updates with a single pass through the network, in a process we call Learning Everything all at Once (LEO). We show that this approach significantly outperforms other methods on goal-conditioned Craftax and is competitive with existing baselines on continuous control environments, while achieving a >250x speed-up compared to all-goals relabelling. We then go on to show that this approach can be made even more powerful by using LEO as a teacher network, rather than a direct actor. We hope that, by unlocking all-goals learning at scale, LEO can serve as a useful tool for RL practitioners in complex e
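The all-goals update described in this abstract can be illustrated with a minimal tabular sketch. This is not the LEO network or the authors' implementation: it assumes a hypothetical Q-table of shape `[n_states, n_goals, n_actions]`, a sparse reward of 1 when a goal is reached, and illustrative names (`all_goals_update`, `reached`, `alpha`, `gamma`). It only shows how one transition can update the value estimates for every goal in a single vectorized step instead of one relabelled update per goal.

```python
import numpy as np

def all_goals_update(Q, s, a, s_next, reached, alpha=0.5, gamma=0.9):
    """Apply one off-policy Q-learning update for every goal at once.

    Q       : float array [n_states, n_goals, n_actions] (hypothetical layout)
    reached : bool array [n_goals], True where s_next achieves that goal
    """
    reward = reached.astype(float)          # r_g = 1 if goal g is reached, else 0
    bootstrap = Q[s_next].max(axis=-1)      # max over actions, one value per goal
    # A reached goal is treated as terminal for that goal, so no bootstrapping there.
    target = reward + gamma * (~reached) * bootstrap
    Q[s, :, a] += alpha * (target - Q[s, :, a])
    return Q

# Toy setup (assumption): goal g means "be in state g".
n_states, n_goals, n_actions = 3, 3, 2
Q = np.zeros((n_states, n_goals, n_actions))
reached = np.arange(n_goals) == 2           # the transition lands in state 2
Q = all_goals_update(Q, s=0, a=1, s_next=2, reached=reached)
print(Q[0, :, 1])
```

In this toy run, only the entry for the goal that was actually reached moves away from zero, while the entries for the other goals are still updated in the same step (here towards a target of zero).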
Recent agentic language models increasingly need to interact with real-world environments that contain tightly intertwined visual and textual information, often through raw camera pixels rather than separately processed images and tokenized text. This shift highlights the need for a unified perception paradigm. To investigate this idea, we explore Perceive Everything as Pixels (PEAP) and introduce PixelWorld, a benchmark that renders natural-language, tabular, mathematical, and diagrammatic inputs into a shared pixel space. Experiments across multiple benchmarks show that PEAP achieves comparable performance to token-based approaches on semantic understanding tasks, suggesting that vision transformers can partially capture global textual semantics without explicit tokenization. In contrast, reasoning-intensive tasks such as mathematics and code show notable performance degradation, although Chain-of-Thought prompting helps mitigate this gap by compensating for missing symbolic structure. We further find that when visual and textual information are closely integrated, representing everything as pixels simplifies preprocessing and avoids cross-modal misalignment. PixelWorld thus prov
General relativity treats spacetime as dynamical and exhibits its breakdown at singularities. This failure is interpreted as evidence that quantum gravity is not a theory formulated within spacetime; instead, it must explain the very emergence of spacetime from deeper quantum degrees of freedom, thereby resolving singularities. Quantum gravity is therefore envisaged as an axiomatic structure, and algorithmic calculations acting on these axioms are expected to generate spacetime. However, Gödel's incompleteness theorems, Tarski's undefinability theorem, and Chaitin's information-theoretic incompleteness establish intrinsic limits on any such algorithmic programme. Together, these results imply that a wholly algorithmic "Theory of Everything" is impossible: certain facets of reality will remain computationally undecidable and can be accessed only through non-algorithmic understanding. We formalize this by constructing a "Meta-Theory of Everything" grounded in non-algorithmic understanding, showing how it can account for undecidable phenomena and demonstrating that the breakdown of computational descriptions of nature does not entail a breakdown of science. Because any putative simula
The Internet of Everything (IoE) represents an evolution of the Internet of Things (IoT) by integrating people, data, processes, and things into a unified intelligent ecosystem. IoE aims to enhance automation, decision-making, and service efficiency across multiple application domains such as smart cities, healthcare, industry, and next-generation wireless networks. This paper provides a structured overview of the IoE concept, its core components, architectural foundations, enabling technologies, and major research challenges. Finally, open research directions toward 6G-enabled intelligent IoE systems are discussed, with emphasis on scalability, security, privacy, and energy efficiency.
Alternative assets such as mines, power plants, or infrastructure projects are often large, heterogeneous bundles of resources, rights, and outputs whose value is difficult to trade or fractionalize under traditional frameworks. This paper proposes a novel two-tier tokenization architecture to enhance the liquidity and transparency of such complex assets. We introduce the concepts of Element Tokens and Everything Tokens: elemental tokens represent standardized, fully collateralized components of an asset (e.g., outputs, rights, or credits), while an everything token represents the entire asset as a fixed combination of those elements. The architecture enables both fine-grained partial ownership and integrated whole-asset ownership through a system of two-way convertibility. We detail the design and mechanics of this system, including an arbitrage mechanism that keeps the price of the composite token aligned with the net asset value of its constituents. Through illustrative examples in the energy and industrial sectors, we demonstrate that our approach allows previously illiquid, high-value projects to be fractionalized and traded akin to stocks or exchange-traded funds (ETFs). We d
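The price-alignment idea in this abstract can be sketched as follows, under stated assumptions. The paper's actual mechanism is not given in the listing; the basket contents, the prices, the `fee` parameter, and the function names below are all hypothetical. The sketch only shows the generic logic that two-way convertibility makes possible: compare the composite token's price with the net asset value (NAV) of its fixed basket and act when they diverge.

```python
def net_asset_value(basket, element_prices):
    """NAV of one everything token: sum of quantity * price over its elements."""
    return sum(qty * element_prices[name] for name, qty in basket.items())

def arbitrage_action(everything_price, basket, element_prices, fee=0.0):
    """Return the trade that would push the composite price back towards NAV."""
    nav = net_asset_value(basket, element_prices)
    if everything_price > nav + fee:
        return "mint"    # buy the elements, combine them, sell the composite
    if everything_price < nav - fee:
        return "redeem"  # buy the composite, split it, sell the elements
    return "hold"        # the gap is too small to cover the conversion cost

# Hypothetical basket and prices, for illustration only.
basket = {"power_mwh": 100, "carbon_credit": 10}
prices = {"power_mwh": 50.0, "carbon_credit": 20.0}

nav = net_asset_value(basket, prices)
actions = [arbitrage_action(p, basket, prices) for p in (5300.0, 5100.0, 5200.0)]
print(nav, actions)
```

The `fee` band is a design assumption: with a non-zero conversion cost, small deviations from NAV would not be worth trading, so the composite price is only loosely pinned inside that band.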
Wireless communication between road users is essential for environmental perception, reasoning, and mission planning to enable fully autonomous vehicles, and thus improve road safety and transport efficiency. To enable collaborative driving, the concept of Vehicle-to-Everything (V2X) was introduced to the industry long ago. Within the last two decades, several communication standards have been developed based on IEEE 802.11p and cellular standards, namely Dedicated Short-Range Communication (DSRC), Intelligent Transportation System G5 (ITS-G5), and Cellular- and New Radio- Vehicle-to-Everything (C-V2X and NR-V2X). However, while a large number of publications address V2X and analyze the different standards, only a few surveys summarize these results. Furthermore, to our knowledge, no survey exists that analyzes possible future trends and challenges for the global implementation of V2X. Thus, this contribution provides a detailed survey on Vehicle-to-Everything communication standards, their performance, current and future applications, and associated challenges. Based on our research, we have identified several research
Background: Everything as Code (EaC) is an emerging paradigm aiming to codify all aspects of modern software systems. Despite its growing popularity, comprehensive industry standards and peer-reviewed research clarifying its scope and guiding its adoption remain scarce. Aims: This study systematically analyzes existing knowledge and perceptions of EaC, clarifies its scope and boundaries, and provides structured guidance for researchers and practitioners. Method: We conducted a large-scale multivocal literature review (MLR), synthesizing academic and grey literature sources. Findings were analyzed quantitatively and thematically. Based on this analysis, we developed a taxonomy and conceptual model of EaC, validated through collaboration with industry experts. Results: The resulting taxonomy comprises 25 distinct EaC practices organized into six layers based on industry awareness and functional roles. The conceptual model illustrates focus areas, overlaps, and interactions among these EaC practices within the software delivery lifecycle. Additionally, practical code examples demonstrating the implementation of these practices were developed in collaboration with industry experts. Con
The global 6G vision has taken shape after years of international research and development efforts. This work culminated in the ITU-R Recommendation on the "IMT-2030 Framework". While the definition phase of technological requirements is currently ongoing, the 3GPP standardization process on 6G networks is expected to start in 2025, with worldwide commercialization around 2029-2030. This article serves as a comprehensive guide to 6G by providing an overall vision, a contemporary survey of the main literature, and an informative tutorial-type presentation style. In our vision, 6G will be based on three fundamental elements: wireless, artificial intelligence, and Internet of Everything. Consequently, 6G can ultimately become the Intelligent Network of Everything while serving as an enabling platform for the next major disruption in mobile communication, called mobile intelligence. The potential of mobile intelligence is that anything can be made connected, intelligent, and aware of its environment. This will revolutionize how devices, systems, and applications are designed; how they operate and interact with humans and each other; and how they can be used for the benefit of people, soc
Concurrent with advancements in molecular communication (MC), bacterial communication is emerging as a key area of interest. Given the frequent use of bacteria in various MC models, it is essential to have a thorough grasp of their intrinsic communication, signaling, and engineering techniques. Although it is crucial to have a strong understanding of the communication background, the inherent biological variability of bacteria may introduce complexity. Thus, an in-depth understanding of bacteria and their communication is a must for improving and extending the models in which they are utilized. Furthermore, the emerging and evolving domain of bacterial computing provides an exciting opportunity for advancing applications in areas such as environmental monitoring and biological computing networks. By integrating the communication and sensing capabilities, bacterial computing offers a promising framework for enhancing the adaptability and responsiveness of bacteria. This paper provides a comprehensive review of bacterial communication and computing, illustrating their application and the link with the concept of the Internet of Everything (IoE). Through the analysis of these biologic
Achieving fully autonomous driving with enhanced safety and efficiency relies on vehicle-to-everything cooperative perception, which enables vehicles to share perception data, thereby enhancing situational awareness and overcoming the limitations of the sensing ability of individual vehicles. Vehicle-to-everything cooperative perception plays a crucial role in extending the perception range, increasing detection accuracy, and supporting more robust decision-making and control in complex environments. This paper provides a comprehensive survey of recent developments in vehicle-to-everything cooperative perception, introducing mathematical models that characterize the perception process under different collaboration strategies. Key techniques for enabling reliable perception sharing, such as agent selection, data alignment, and feature fusion, are examined in detail. In addition, major challenges are discussed, including differences in agents and models, uncertainty in perception outputs, and the impact of communication constraints such as transmission delay and data loss. The paper concludes by outlining promising research directions, including privacy-preserving artificial intellig
In the field of information extraction (IE), tasks across a wide range of modalities and their combinations have been traditionally studied in isolation, leaving a gap in deeply recognizing and analyzing cross-modal information. To address this, this work for the first time introduces the concept of grounded Multimodal Universal Information Extraction (MUIE), providing a unified task framework to analyze any IE tasks over various modalities, along with their fine-grained groundings. To tackle MUIE, we tailor a multimodal large language model (MLLM), Reamo, capable of extracting and grounding information from all modalities, i.e., recognizing everything from all modalities at once. Reamo is updated via varied tuning strategies, equipping it with powerful capabilities for information recognition and fine-grained multimodal grounding. To address the absence of a suitable benchmark for grounded MUIE, we curate a high-quality, diverse, and challenging test set, which encompasses IE tasks across 9 common modality combinations with the corresponding multimodal groundings. The extensive comparison of Reamo with existing MLLMs integrated into pipeline approaches demonstrates its advantages
Decentralized Metaverses, built on Web 3.0 and Web 4.0 technologies, have attracted significant attention across various fields. This innovation leverages blockchain, Decentralized Autonomous Organizations (DAOs), Extended Reality (XR) and advanced technologies to create immersive and interconnected digital environments that mirror the real world. This article delves into the Metaverse of Everything (MoE), a platform that fuses the Metaverse concept with the Internet of Everything (IoE), an advanced version of the Internet of Things (IoT) that connects not only physical devices but also people, data and processes within a networked environment. Thus, the MoE integrates generated data and virtual entities, creating an extensive network of interconnected components. This article seeks to advance current MoE, examining decentralization and the application of Opportunistic Edge Computing (OEC) for interactions with surrounding IoT devices and IoE entities. Moreover, it outlines the main challenges to guide researchers and businesses towards building a future cyber-resilient opportunistic MoE.
We present Track Anything Behind Everything (TABE), a novel pipeline for zero-shot amodal video object segmentation. Unlike existing methods that require pretrained class labels, our approach uses a single query mask from the first frame where the object is visible, enabling flexible, zero-shot inference. We pose amodal segmentation as generative outpainting from modal (visible) masks using a pretrained video diffusion model. We do not need to re-train the diffusion model to accommodate additional input channels but instead use a pretrained model that we fine-tune at test-time to allow specialisation towards the tracked object. Our TABE pipeline is specifically designed to handle amodal completion, even in scenarios where objects are completely occluded. Our model and code will all be released.
We introduce a new generative system called Edit Everything, which can take image and text inputs and produce image outputs. Edit Everything allows users to edit images using simple text instructions. Our system designs prompts to guide the visual module in generating requested images. Experiments demonstrate that Edit Everything facilitates the implementation of the visual aspects of Stable Diffusion with the use of the Segment Anything model and CLIP. Our system is publicly available at https://github.com/DefengXie/Edit_Everything.
Ensemble everything everywhere is a defense to adversarial examples that was recently proposed to make image classifiers robust. This defense works by ensembling a model's intermediate representations at multiple noisy image resolutions, producing a single robust classification. This defense was shown to be effective against multiple state-of-the-art attacks. Perhaps even more convincingly, it was shown that the model's gradients are perceptually aligned: attacks against the model produce noise that perceptually resembles the targeted class. In this short note, we show that this defense is not robust to adversarial attack. We first show that the defense's randomness and ensembling method cause severe gradient masking. We then use standard adaptive attack techniques to reduce the defense's robust accuracy from 48% to 14% on CIFAR-100 and from 62% to 11% on CIFAR-10, under the $\ell_\infty$-norm threat model with $\varepsilon=8/255$.
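The threat model named in this abstract, an $\ell_\infty$ bound with $\varepsilon=8/255$, can be made concrete with a short sketch. The abstract does not specify which attack was used, so the signed-gradient step below is a generic illustration rather than the authors' method, and the gradient values, step size, and function names are hypothetical. The part that follows directly from the abstract is the constraint itself: every pixel of the perturbed image must stay within $\varepsilon$ of the original.

```python
def project_linf(x_adv, x, eps):
    """Clip each coordinate to [x - eps, x + eps], then to the pixel range [0, 1]."""
    out = []
    for a, c in zip(x_adv, x):
        a = min(max(a, c - eps), c + eps)
        out.append(min(max(a, 0.0), 1.0))
    return out

def signed_step(x_adv, grad, x, eps, step):
    """Move each coordinate by `step` in the sign of the gradient, then project."""
    sign = lambda g: 1 if g > 0 else -1 if g < 0 else 0
    moved = [a + step * sign(g) for a, g in zip(x_adv, grad)]
    return project_linf(moved, x, eps)

EPS = 8 / 255
x = [0.0, 0.5, 1.0]        # toy "image" with three pixels
grad = [1.0, -2.0, 0.5]    # hypothetical loss gradient, held fixed for simplicity

x_adv = list(x)
for _ in range(10):        # more steps than needed, to show the projection binding
    x_adv = signed_step(x_adv, grad, x, EPS, step=2 / 255)
print(x_adv)
```

After enough steps the first two pixels sit exactly on the boundary of the $\varepsilon$-ball, and the third stays at 1.0 because it cannot leave the valid pixel range.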
Multi-object tracking (MOT) emerges as a pivotal and highly promising branch in the field of computer vision. Classical closed-vocabulary MOT (CV-MOT) methods aim to track objects of predefined categories. Recently, some open-vocabulary MOT (OV-MOT) methods have successfully addressed the problem of tracking unknown categories. However, we found that the CV-MOT and OV-MOT methods each struggle to excel in the tasks of the other. In this paper, we present a unified framework, Associate Everything Detected (AED), that simultaneously tackles CV-MOT and OV-MOT by integrating with any off-the-shelf detector and supports unknown categories. Different from existing tracking-by-detection MOT methods, AED gets rid of prior knowledge (e.g. motion cues) and relies solely on highly robust feature learning to handle complex trajectories in OV-MOT tasks while keeping excellent performance in CV-MOT tasks. Specifically, we model the association task as a similarity decoding problem and propose a sim-decoder with an association-centric learning mechanism. The sim-decoder calculates similarities in three aspects: spatial, temporal, and cross-clip. Subsequently, association-centric learning leverage
The Theory of Everything ($S_{\text{ToE}}$) seeks to unify all fundamental forces of nature, including quantum gravity, into a single theoretical framework. This theory would be defined internally using a set of axioms, and this paper proposes a set of axioms for any such theory. Furthermore, for such a theory, all scientific truth would be defined internally as consequences derivable from the rules of such a theory. This paper then examines the implications of Tarski's undefinability theorem on scientific truths derived from such axioms. We demonstrate that Tarski's theorem imposes limitations on any such formal system $S_{\text{ToE}}$. However, we also argue that the Lucas-Penrose argument suggests that non-algorithmic understanding can transcend these formal limitations.
Agriculture faces critical challenges from population growth, resource scarcity, and climate change, driving a shift toward advanced, technology-integrated farming. Mechanization has transformed agriculture, enhancing sustainability and crop productivity. Now, technologies like artificial intelligence (AI), robotics, biotechnology, blockchain, and the Internet of Things (IoT) are advancing precision agriculture. The concept of the Internet of Everything (IoE) has gained traction due to its holistic approach to integrating various IoT specializations, called IoXs, with X referring to a specific domain. This paper explores the transformative role of IoE in agriculture, expanding beyond traditional IoT applications to integrate niche subdomains like molecular communication (MC), the Internet of Nano Things (IoNT), the Internet of Bio-Nano Things (IoBNT), designer phages, and the Internet of Fungus (IoF). Our study provides a detailed review of how these IoE subdomains, in conjunction with 6G, blockchain, and machine learning (ML), can enhance precision farming in areas like crop monitoring, resource management, and disease control. Unlike prior IoT-centric reviews, this work uniquely f
Figurative and non-literal expressions are profoundly integrated in human communication. Visualising such expressions allows us to convey our creative thoughts and evoke nuanced emotions. Recent text-to-image models like Stable Diffusion, on the other hand, struggle to depict non-literal expressions. Recent works primarily deal with this issue by compiling human-annotated datasets on a small scale, which not only demands specialised expertise but also proves highly inefficient. To address this issue, we introduce ViPE: Visualise Pretty-much Everything. ViPE offers a series of lightweight and robust language models that have been trained on a large-scale set of lyrics with noisy visual descriptions that represent their implicit meaning. The synthetic visual descriptions are generated by GPT3.5 relying on neither human annotations nor images. ViPE effectively expresses any arbitrary piece of text into a visualisable description, enabling meaningful and high-quality image generation. We provide compelling evidence that ViPE is more robust than GPT3.5 in synthesising visual elaborations. ViPE also exhibits an understanding of figurative expressions comparable to human experts, provid