Training Mixture-of-Experts (MoE) models introduces sparse and highly imbalanced all-to-all communication that dominates iteration time. Conventional load-balancing methods fail to exploit the deterministic topology of Rail architectures, leaving multi-NIC bandwidth underutilized. We present RailS, a distributed load-balancing framework that minimizes all-to-all completion time in MoE training. RailS leverages the Rail topology's symmetry to prove that uniform sending ensures uniform receiving, transforming global coordination into local scheduling. Each node independently executes a Longest Processing Time First (LPT) spraying scheduler to proactively balance traffic using local information. RailS activates N parallel rails for fine-grained, topology-aware multipath transmission. Across synthetic and real-world MoE workloads, RailS improves bus bandwidth by 20%--78% and reduces completion time by 17%--78%. For Mixtral workloads, it shortens iteration time by 18%--40% and achieves near-optimal load balance, fully exploiting architectural parallelism in distributed training.
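The per-node scheduling idea can be illustrated with the classic Longest Processing Time First greedy rule: sort flows by size, largest first, and place each one on the currently least-loaded rail. This is a minimal sketch of generic LPT only, not RailS's actual scheduler; the function name `lpt_spray`, the flow sizes, and the rail count are hypothetical.

```python
import heapq


def lpt_spray(flow_sizes, num_rails):
    """Greedy LPT: largest flow first, each onto the least-loaded rail.

    Returns (assignment, loads), where assignment maps a rail index to the
    list of flow indices placed on it and loads[r] is the total size on rail r.
    """
    # Min-heap of (current load, rail index); ties resolve to the lower index.
    heap = [(0, rail) for rail in range(num_rails)]
    heapq.heapify(heap)
    assignment = {rail: [] for rail in range(num_rails)}

    # Visit flows in order of decreasing size.
    by_size = sorted(enumerate(flow_sizes), key=lambda pair: pair[1], reverse=True)
    for flow_id, size in by_size:
        load, rail = heapq.heappop(heap)
        assignment[rail].append(flow_id)
        heapq.heappush(heap, (load + size, rail))

    loads = [0] * num_rails
    for load, rail in heap:
        loads[rail] = load
    return assignment, loads


# Hypothetical per-destination traffic volumes (arbitrary units) on 3 rails.
flows = [7, 5, 4, 3, 3, 2]
assignment, loads = lpt_spray(flows, num_rails=3)
print(assignment)  # which flows go on which rail
print(loads)       # per-rail totals
```

The scheduler only needs the sizes of the local node's own flows, which matches the abstract's claim that balancing can be done from local information.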
Rail-optimized network fabrics have become the de facto datacenter scale-out fabric for large-scale ML training. However, the use of high-radix electrical switches to provide all-to-all connectivity in rails imposes massive power and cost overheads. We propose a rethinking of the rail abstraction by retaining its communication semantics, but realizing it using optical circuit switches. The key challenge is that optical switches support only one-to-one connectivity at a time, limiting the fan-out of traffic in ML workloads using hybrid parallelisms. We overcome this through \emph{parallelism-driven rail reconfiguration}, which exploits the non-overlapping communication phases of different parallelism dimensions. This time-multiplexes a single set of physical ports across circuit configurations tailored to each phase within a training iteration. We design and implement Opus, a control plane that orchestrates this in-job reconfiguration of photonic rails at parallelism phase boundaries, and evaluate it on a physical OCS testbed, the Perlmutter supercomputer, and in simulation at up to 2,048 GPUs. Our results show that photonic rails can achieve over $23\times$ network power reduction and $4\times$ c
Large Language Models (LLMs) like GPT-3.5-Turbo are increasingly used to assist software development, yet they often produce incomplete code or incorrect imports, especially when lacking access to external or project-specific documentation. We introduce RAILS (Retrieval-Augmented Intelligence for Learning Software Development), a framework that augments LLM prompts with semantically retrieved context from curated Java resources using FAISS and OpenAI embeddings. RAILS incorporates an iterative validation loop guided by compiler feedback to refine suggestions. We evaluated RAILS on 78 real-world Java import error cases spanning standard libraries, GUI APIs, external tools, and custom utilities. Despite using the same LLM, RAILS outperforms baseline prompting by preserving intent, avoiding hallucinations, and surfacing correct imports even when libraries are unavailable locally. Future work will integrate symbolic filtering via PostgreSQL and extend support to other languages and IDEs.
NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational systems. Guardrails (or rails for short) are a specific way of controlling the output of an LLM, such as not talking about topics considered harmful, following a predefined dialogue path, using a particular language style, and more. There are several mechanisms that allow LLM providers and developers to add guardrails that are embedded into a specific model at training time, e.g. using model alignment. In contrast, using a runtime inspired by dialogue management, NeMo Guardrails allows developers to add programmable rails to LLM applications; these are user-defined, independent of the underlying LLM, and interpretable. Our initial results show that the proposed approach can be used with several LLM providers to develop controllable and safe LLM applications using programmable rails.
The emergence of the fifth generation (5G) technology has transformed mobile networks into multi-service environments, necessitating efficient network slicing to meet diverse Service Level Agreements (SLAs). SLA decomposition across multiple network domains, each potentially managed by different service providers, poses a significant challenge due to limited visibility into real-time underlying domain conditions. This paper introduces Risk-Aware Iterated Local Search (RAILS), a novel risk model-driven meta-heuristic framework designed to jointly address SLA decomposition and service provider selection in multi-domain networks. By integrating online risk modeling with iterated local search principles, RAILS effectively navigates the complex optimization landscape, utilizing historical feedback from domain controllers. We formulate the joint problem as a Mixed-Integer Nonlinear Programming (MINLP) problem and prove its NP-hardness. Extensive simulations demonstrate that RAILS achieves near-optimal performance, offering an efficient, real-time solution for adaptive SLA management in modern multi-domain networks.
Agentic payment systems extend delegated action to financial transfers, but scaling them on stablecoin rails in regulated settings requires safeguards that remain effective when humans are not continuously in the loop. We present a compliance-aware architecture that combines x402-style, signature-based payment authorisation and relayed execution with programmable compliance embedded as an on-chain guardrail via a policy wrapper and policy manager coordinating modular checks. By enforcing compliance at the point of execution, rather than as a separate off-chain workflow, the approach preserves low-friction settlement when conditions are satisfied, records transaction-linked on-chain attestations, and supports structured resolution when requirements are pending.
Sleep staging models often degrade when deployed on patients with unseen physiology or recording conditions. We propose a streaming, source-free test-time adaptation (TTA) recipe that combines entropy minimization (Tent) with Batch-Norm statistic refresh and two safety rails: an entropy gate to pause adaptation on uncertain windows and an EMA-based reset to reel back drift. On Sleep-EDF Expanded, using single-lead EEG (Fpz-Cz, 100 Hz, 30 s epochs; R&K to AASM mapping), we show consistent gains over a frozen baseline at seconds-level latency and minimal memory, reporting per-stage metrics and Cohen's kappa. The method is model-agnostic, requires no source data or patient calibration, and is practical for on-device or bedside use.
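One plausible reading of the entropy gate is a threshold on the Shannon entropy of the predicted stage distribution: adapt only when the model is already fairly confident on the current window. The sketch below covers the gate only, under that assumption; the function names and the `max_fraction` threshold (a fraction of the maximum entropy $\log K$ for $K$ classes) are hypothetical and not taken from the paper, and the Tent update and EMA reset are not shown.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_adapt(probs, max_fraction=0.4):
    """Entropy gate: allow adaptation only on sufficiently confident windows.

    max_fraction is a hypothetical threshold, expressed as a fraction of the
    maximum possible entropy log(K) for K classes.
    """
    max_entropy = math.log(len(probs))
    return entropy(probs) <= max_fraction * max_entropy


# Hypothetical softmax outputs over five sleep stages.
confident = [0.9, 0.025, 0.025, 0.025, 0.025]
uncertain = [0.2, 0.2, 0.2, 0.2, 0.2]

print(should_adapt(confident))  # gate open: adaptation step may run
print(should_adapt(uncertain))  # gate closed: adaptation is paused
```

Normalizing the threshold by $\log K$ keeps the gate meaningful if the number of stages changes, which is one reason to prefer it over an absolute entropy cutoff in a sketch like this.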
Rail-optimized network fabrics have become the de facto datacenter scale-out fabric for large-scale ML training. However, the use of high-radix electrical switches to provide all-to-all connectivity in rails imposes massive power, cost, and complexity overheads. We propose a rethinking of the rail abstraction by retaining its communication semantics, but realizing it using optical circuit switches. The key challenge is that optical switches support only one-to-one connectivity at a time, limiting the fan-out of traffic in ML workloads using hybrid parallelisms. We introduce parallelism-driven rail reconfiguration as a solution that leverages the sequential ordering between traffic from different parallelisms. We design a control plane, Opus, to enable time-multiplexed emulation of electrical rail switches using optical switches. More broadly, our work discusses a new research agenda: datacenter fabrics that co-evolve with the model parallelism dimensions within each job, as opposed to the prevailing mindset of reconfiguring networks before a job begins.
We propose Guardian-FC, a novel two-layer framework for privacy-preserving federated computing that unifies safety enforcement across diverse privacy-preserving mechanisms, including cryptographic back-ends like fully homomorphic encryption (FHE) and multiparty computation (MPC), as well as statistical techniques such as differential privacy (DP). Guardian-FC decouples guard-rails from privacy mechanisms by executing plug-ins (modular computation units), written in a backend-neutral, domain-specific language (DSL) designed specifically for federated computing workflows and interchangeable Execution Providers (EPs), which implement DSL operations for various privacy back-ends. An Agentic-AI control plane enforces a finite-state safety loop through signed telemetry and commands, ensuring consistent risk management and auditability. The manifest-centric design supports fail-fast job admission and seamless extensibility to new privacy back-ends. We present qualitative scenarios illustrating backend-agnostic safety and a formal model foundation for verification. Finally, we outline a research agenda inviting the community to advance adaptive guard-rail tuning, multi-backend composition,
The surface tension of partially wetting droplets deforms soft substrates. These deformations are usually localized to a narrow region near the contact line, forming a so-called `elastocapillary ridge.' When a droplet slides along a substrate, the movement of the elastocapillary ridge dissipates energy in the substrate and slows the droplet down. Previous studies have analyzed isotropically spreading droplets and found that the advancing contact line `surfs' the elastocapillary ridge, with a velocity determined by a local balance of capillary forces and bulk rheology. Here, we experimentally explore the dynamics of a droplet sliding across soft substrates. At low velocities, the contact line is nearly circular, and dissipation increases logarithmically with speed. At higher droplet velocities, the contact line adopts a bullet-like shape, and the dissipation levels off. At the same time, droplets shed a pair of `elastocapillary rails' that fade away slowly behind them. These results suggest that droplets favor sliding along a stationary ridge over surfing atop a translating one.
Adversarial attacks against deep neural networks (DNNs) are continuously evolving, requiring increasingly powerful defense strategies. We develop a novel adversarial defense framework inspired by the adaptive immune system: the Robust Adversarial Immune-inspired Learning System (RAILS). Initializing a population of exemplars that is balanced across classes, RAILS starts from a uniform label distribution that encourages diversity and uses an evolutionary optimization process to adaptively adjust the predictive label distribution in a manner that emulates the way the natural immune system recognizes novel pathogens. RAILS' evolutionary optimization process explicitly captures the tradeoff between robustness (diversity) and accuracy (specificity) of the network, and represents a new immune-inspired perspective on adversarial learning. The benefits of RAILS are empirically demonstrated under eight types of adversarial attacks on a DNN adversarial image classifier for several benchmark datasets, including MNIST, SVHN, and CIFAR-10. We find that PGD is the most damaging attack strategy and that for this attack RAILS is significantly more robust than other methods, achieving im
Magnetic skyrmions are promising candidates as information carriers in spintronic devices. The transport of individual skyrmions in a fast and controlled way is a key issue in this field. Here we introduce a novel platform for accelerating, guiding and compressing skyrmions along predefined paths. The guiding mechanism is based on two parallel defect-lines (rails), one attractive and the other repulsive. Numerical simulations, using parameters from state-of-the-art experiments, show that the speed of the skyrmions along the rails is increased up to a factor of ten with respect to the non-defect case whereas the distance between rails can be as small as the initial radius of the skyrmions. In this way, the flux of information that can be coded and transported with magnetic skyrmions could be significantly increased.
In robotic-assisted partial nephrectomy, surgeons remove a part of a kidney often due to the presence of a mass. A drop-in ultrasound probe paired to a surgical robot is deployed to execute multiple swipes over the kidney surface to localise the mass and define the margins of resection. This sub-task is challenging and must be performed by a highly skilled surgeon. Automating this sub-task may reduce cognitive load for the surgeon and improve patient outcomes. The overall goal of this work is to autonomously move the ultrasound probe on the surface of the kidney taking advantage of the use of the Pneumatically Attachable Flexible (PAF) rail system, a soft robotic device used for organ scanning and repositioning. First, we integrate a shape-sensing optical fibre into the PAF rail system to evaluate the curvature of target organs in robotic-assisted laparoscopic surgery. Then, we investigate the impact of the stiffness of the material of the PAF rail on the curvature sensing accuracy, considering that soft targets are present in the surgical field. Finally, we use shape sensing to plan the trajectory of the da Vinci surgical robot paired with a drop-in ultrasound probe and autonomous
We learn an interactive vision-based driving policy from pre-recorded driving logs via a model-based approach. A forward model of the world supervises a driving policy that predicts the outcome of any potential driving trajectory. To support learning from pre-recorded logs, we assume that the world is on rails, meaning neither the agent nor its actions influence the environment. This assumption greatly simplifies the learning problem, factorizing the dynamics into a nonreactive world model and a low-dimensional and compact forward model of the ego-vehicle. Our approach computes action-values for each training trajectory using a tabular dynamic-programming evaluation of the Bellman equations; these action-values in turn supervise the final vision-based driving policy. Despite the world-on-rails assumption, the final driving policy acts well in a dynamic and reactive world. At the time of writing, our method ranks first on the CARLA leaderboard, attaining a 25% higher driving score while using 40 times less data. Our method is also an order of magnitude more sample-efficient than state-of-the-art model-free reinforcement learning techniques on navigational tasks in the ProcGen benchm
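The tabular evaluation can be pictured as finite-horizon backward induction: because the logged world is assumed not to react, the reward for occupying an ego state at a given time is fixed by the log, and only the ego-vehicle transition depends on the action. The following is a toy sketch under those assumptions, not the paper's implementation; the three-lane state space, reward values, `transition` function, and discount `gamma` are all hypothetical.

```python
def backward_action_values(rewards, transition, num_actions, gamma=1.0):
    """Finite-horizon Bellman backup over a tabular ego-state space.

    rewards[t][s] is the reward for occupying ego state s at time t; it is
    fixed in advance because the logged world does not react to the agent.
    transition(s, a) returns the next ego state (deterministic forward model).
    Returns Q[t][s][a] and V[t][s], with V at the horizon set to zero.
    """
    horizon = len(rewards)
    num_states = len(rewards[0])
    V = [[0.0] * num_states for _ in range(horizon + 1)]
    Q = [[[0.0] * num_actions for _ in range(num_states)] for _ in range(horizon)]

    for t in reversed(range(horizon)):
        for s in range(num_states):
            for a in range(num_actions):
                Q[t][s][a] = rewards[t][s] + gamma * V[t + 1][transition(s, a)]
            V[t][s] = max(Q[t][s])
    return Q, V


# Toy setup: 3 lanes; actions 0 = left, 1 = stay, 2 = right (clamped at edges).
def transition(lane, action):
    return min(max(lane + action - 1, 0), 2)


# A logged vehicle occupies lane 1 at time step 1, hence the large penalty.
rewards = [
    [1, 1, 1],
    [1, -10, 1],
    [1, 1, 1],
]
Q, V = backward_action_values(rewards, transition, num_actions=3)
print(Q[0][1])  # action-values for starting in the middle lane
```

In this toy case the action-values for the middle lane at time 0 favor changing lanes over staying, since staying leads into the penalized cell. Values like these are what would then supervise a policy network in the approach the abstract describes.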
One common feature of a vehicle, an ant and a kinesin motor is that they all convert chemical energy, derived from fuel or food, into mechanical energy required for their forward movement; such objects have been modelled in recent years as {\it self-driven} ``particles''. Cytoskeletal filaments, e.g., microtubules, form a ``rail'' network for intra-cellular transport of vesicular cargo by molecular motors such as kinesins. Similarly, ants move along trails while vehicles move along lanes. Therefore, the traffic of vehicles and organisms as well as that of molecular motors can be modelled as systems of interacting self-driven particles; these are of current interest in non-equilibrium statistical mechanics. In this paper we point out the common features of these model systems and emphasize the crucial differences in their physical properties.
Web development is currently driven by model-view-controller (MVC) frameworks. How has content management adapted to this scenario? This paper reviews content management features in the Ruby on Rails framework and its most popular plug-ins. These features are distributed among the different layers of the MVC architecture.
Any quantum state of the radiation field, sliced into small non-overlapping space-time bins, is a collection of single-rail qubits, each spanning the vacuum and single-photon Fock state of a mode. Quantum logic on these qubits would enable arbitrary measurements on information-bearing light, but is hard due to the lack of strong nonlinearities. With unentangled ancilla single-rail qubits, an $8$-port interferometer and photon detection, we show any single-rail qubit measurement in the $XY$ Bloch plane is realizable with success probability $147/256$, which beats the prior-known $1/2$ limit.
We estimate the temperature distribution in the rails of an electromagnetic rail gun (EMG) due to the confinement of the current in a narrow surface layer resulting from the skin effect. In order to obtain analytic results, we assume a simple geometry for the rails, an electromagnetic skin effect boundary edge that propagates with the accelerating armature, and a current carrying channel controlled by magnetic field diffusion into the rails. We compute the temperature distribution in the rails at the time that the armature leaves the rails. For the range of exit velocities, from 1500 m/s to 5000 m/s, we find the highest temperatures are near the gun breech. After a single gun firing, the temperature reaches the melting temperature of the metal rails in a layer of finite thickness near the surface of the rails, for rails made of copper or tantalum. We plot the thickness of the melt layer as a function of position along the rails. In all cases, the thickness of the melt layer increases with gun velocity, making damage to the gun rails more likely at higher velocity. We also calculate the efficiency of the EMG as a function of gun velocity and find that the efficiency increases with i
Vibrational modes of trapped ions have traditionally served as quantum buses to mediate internal qubits. However, with recent advances in quantum control, it has become possible to use these vibrational modes directly as quantum computational resources, such as bosonic qubits. Here, we propose a dual-rail encoding scheme in which a dual-rail qubit is encoded by two vibrational modes that share a single phonon. We present the preparation, measurement, and implementation of single- and two-qubit gates, enabling universal quantum computation. The dual-rail qubit system offers scalability and all-to-all connectivity. Moreover, we extend the dual-rail qubit system to a logical internal qubit--dual-rail qubit hybrid system by incorporating internal qubits into the dual-rail qubit system as another type of logical qubit. The hybrid system nearly doubles the number of available logical qubits compared to conventional trapped-ion quantum computers while maintaining all-to-all connectivity. Additionally, we propose a method for implementing multi-qubit controlled gates and discuss potential applications that can leverage the advantages of the hybrid system. Our scheme provides a practical fr
Non-destructive evaluation (NDE) of rail tracks is crucial to ensure the safety and reliability of rail transportation systems. In this work, we present a quantitative study using various signal processing methods to identify defects in rail structures. A diffuse field configuration was employed at a few tens of kilohertz, where the emitter and receiver were remotely located, and wave energy propagated via multiple reflections within the medium. A reference database is first constructed by acquiring measurements at different rail positions and different torque levels (up to 50 N·m). The defect is then identified by comparing its signature to those stored in the database. First, the destretching technique, based on Coda Wave Interferometry (CWI), is applied to correct for temperature-induced velocity variations. Then, the identification is performed using the Mean Square Error (MSE) metric and the Orthogonal Matching Pursuit (OMP) technique. A comparative analysis of both methods is conducted, focusing on their robustness and performance.
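The MSE-based identification step amounts to a nearest-signature lookup: compute the mean square error between the measured signal and each reference signature, and return the label of the closest one. The sketch below assumes the signals have already been corrected for temperature (for instance by the destretching step) and sampled on a common grid; the labels, signal values, and function names are hypothetical, and the OMP variant is not shown.

```python
def mean_square_error(x, y):
    """Mean square error between two equal-length sampled signals."""
    if len(x) != len(y):
        raise ValueError("signals must have the same number of samples")
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def identify(signal, database):
    """Return the label of the reference signature closest to signal in MSE."""
    return min(database, key=lambda label: mean_square_error(signal, database[label]))


# Hypothetical reference database keyed by (position, torque) labels.
database = {
    "pos_A_10Nm": [0.0, 1.0, 0.0, -1.0],
    "pos_A_50Nm": [0.0, 0.8, 0.1, -0.7],
    "pos_B_50Nm": [0.5, 0.2, -0.4, 0.1],
}

# Hypothetical measurement, assumed already temperature-corrected.
measured = [0.0, 0.82, 0.08, -0.72]
print(identify(measured, database))
```

A lookup of this kind returns exactly one stored label per measurement, so its resolution is bounded by how densely the reference database covers positions and torque levels.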