Despite rapid recent progress in the terminal capabilities of large language models, the training data strategies behind state-of-the-art terminal agents remain largely undisclosed. We address this gap through a systematic study of data engineering practices for terminal agents, making two key contributions: (1) Terminal-Task-Gen, a lightweight synthetic task generation pipeline that supports seed-based and skill-based task construction, and (2) a comprehensive analysis of data and training strategies, including filtering, curriculum learning, long-context training, and scaling behavior. Our pipeline yields Terminal-Corpus, a large-scale open-source dataset for terminal tasks. Using this dataset, we train Nemotron-Terminal, a family of models initialized from Qwen3 (8B, 14B, 32B) that achieve substantial gains on Terminal-Bench 2.0: Nemotron-Terminal-8B improves from 2.5% to 13.0%, Nemotron-Terminal-14B improves from 4.0% to 20.2%, and Nemotron-Terminal-32B improves from 3.4% to 27.4%, matching the performance of significantly larger models. To accelerate research in this domain, we open-source our model checkpoints and most of our synthetic datasets at https://huggingface.co/collect
We introduce TerminalWorld, a scalable data engine that automatically reverse-engineers high-fidelity evaluation tasks from "in-the-wild" terminal recordings. Processing 80,870 terminal recordings, the engine yields a full benchmark of 1,530 validated tasks, spanning 18 real-world categories, ranging from short everyday operations to workflows exceeding 50 steps, and covering 1,280 unique commands. From these, we curate a Verified subset of 200 representative, manually reviewed tasks. Comprehensive benchmarking on TerminalWorld-Verified across eight frontier models and six agents reveals that current systems still struggle with authentic terminal workflows, achieving a maximum pass rate of only 62.5%. Moreover, TerminalWorld captures real-world terminal capabilities distinct from existing expert-curated benchmarks (e.g., Terminal-Bench), with only a weak correlation to their scores (Pearson r=0.20). The automated engine makes TerminalWorld authentic and scalable by construction, enabling it to evaluate agents in real-world terminal environments as developer practices evolve. Data and code are available at https://github.com/EuniAI/TerminalWorld.
Terminals provide a powerful interface for AI agents by exposing diverse tools for automating complex workflows, yet existing terminal-agent benchmarks largely focus on tasks grounded in text, code, and structured files. However, many real-world workflows require practitioners to work directly with audio and video files. Working with such multimedia files calls for terminal agents not only to understand multimedia content, but also to convert auditory and visual evidence across related files into appropriate actions. To evaluate terminal agents on multimedia-file tasks, we introduce MultiMedia-TerminalBench (MMTB), a benchmark of 105 tasks across 5 meta-categories where terminal agents directly operate with audio and video files. Alongside MMTB, we propose Terminus-MM, a multimedia harness that extends Terminus-KIRA with audio and video perception for terminal agents. Together, MMTB and Terminus-MM support a controlled study of multimedia terminal agents, revealing how different forms of multimedia access shape task outcomes and determine which evidence agents rely on to construct executable terminal workflows. MMTB media and metadata are released at https://huggingface.co/datasets
Training agentic models for terminal-based tasks critically depends on high-quality terminal trajectories that capture realistic long-horizon interactions across diverse domains. However, constructing such data at scale remains challenging due to two key requirements: \textbf{\emph{Executability}}, since each instance requires a suitable and often distinct Docker environment; and \textbf{\emph{Verifiability}}, because heterogeneous task outputs preclude unified, standardized verification. To address these challenges, we propose \textbf{TerminalTraj}, a scalable pipeline that (i) filters high-quality repositories to construct Dockerized execution environments, (ii) generates Docker-aligned task instances, and (iii) synthesizes agent trajectories with executable validation code. Using TerminalTraj, we curate 32K Docker images and generate 50,733 verified terminal trajectories across eight domains. Models trained on this data with the Qwen2.5-Coder backbone achieve consistent performance improvements on TerminalBench (TB), with gains of up to 20\% on TB~1.0 and 10\% on TB~2.0 over their respective backbones. Notably, \textbf{TerminalTraj-32B} achieves strong performance among models w
The Unix terminal, or simply the terminal, is used in almost every facet of computing. It is available across all major platforms and is often integrated into other applications. Due to its ubiquity, even marginal improvements to the terminal have the potential to yield large productivity gains on a global scale. We believe that evolutionary improvements to the terminal, in its current incarnation as a windowed terminal emulator, are possible and that a thorough understanding of the issues current terminal users face is fundamental to knowing how the terminal should evolve. To develop that understanding, we mined Unix and Linux Stack Exchange using a fully reproducible method that extracted and categorized 91.0% of 1,489 terminal-related questions (from the full set of nearly 240,000 questions) without manual intervention. We present an analysis, to our knowledge the first of its kind, of windowed terminal-related questions posted over a 15-year period and viewed, in aggregate, approximately 40 million times. As expected, given its longevity, we find the terminal's many features being applied across a wide variety of
Analysis of data from randomized controlled trials in vulnerable populations requires special attention when assessing treatment effect by a score measuring, e.g., disease stage or activity together with onset of prevalent terminal events. In reality, it is impossible to disentangle a disease score from the terminal event, since the score is not clinically meaningful after this event. In this work, we propose to assess treatment interventions simultaneously on the terminal event and the disease score in the absence of a terminal event. Our proposal is based on a natural data-generating mechanism, respecting that a disease score does not exist beyond the terminal event. We use modern semi-parametric statistical methods to provide robust and efficient estimation of the risk of terminal event and expected disease score conditional on no terminal event at a pre-specified landmark time. We also use the simultaneous asymptotic behaviour of our estimators to develop a powerful closed testing procedure for confirmatory assessment of treatment effect on both onset of terminal event and level of disease score in the absence of a terminal event. A simulation study mimicking a large-scale outc
Terminal-using agents have quickly become the most popular downstream application of language models (LMs). Despite their prevalence, relatively little academic work has examined RL-based training of these models, likely due to difficult benchmarks, a lack of data, and a lack of simple baseline recipes. We present Tmax, the strongest open RL recipe for terminal agents to date, bringing open data recipes closer to the frontier. While simple, our recipe achieves 27\% on Terminal-Bench 2.0 with only 9B parameters, outperforming much larger models from prior work. Concretely, we generate data using a novel taxonomy, combining difficulty control, personas, and verifier diversification, which allows us to cheaply generate large amounts of terminal environments for RL and SFT training. We open-source our terminal dataset, which is over 2.5x larger than previously released terminal-agent datasets. We then train open-weight models using RL with our data, using a simple, outcome-only recipe. We release our data, models, and code as a strong baseline for future open academic work on terminal agents at https://github.com/hamishivi/tmax.
Given a connected graph $G$ and a terminal set $R \subseteq V(G)$, the Steiner tree problem (ST) asks for a tree that spans all of $R$ with at most $r$ vertices from $V(G)\setminus R$, for some integer $r\geq 0$. It is known from (Garey et al., 1977) that ST is NP-complete. A Steiner tree in which all terminal vertices are constrained to be leaves is called a terminal Steiner tree. Our study addresses the existence of a terminal Steiner tree, its complexity across various graph classes, black-box applications of the ST, and a fixed-parameter tractable (FPT) algorithm with respect to the number of terminals.
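The definition above can be made concrete with a small verifier. The following is a minimal sketch, not taken from the paper: the function name `is_terminal_steiner_tree` and its edge-list input format are assumptions, and it checks only the candidate tree itself (it does not confirm that the edges belong to $G$).

```python
from collections import deque


def is_terminal_steiner_tree(tree_edges, terminals, r):
    """Return True if tree_edges form a tree that spans all terminals,
    keeps every terminal as a leaf, and uses at most r non-terminal vertices.

    Hypothetical helper for illustration; vertices are any hashable labels.
    """
    adj = {}
    for u, v in tree_edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    vertices = set(adj)
    terminals = set(terminals)

    # Every terminal must appear in the candidate tree.
    if not terminals <= vertices:
        return False

    # A tree on n vertices has exactly n - 1 distinct edges ...
    if len({frozenset(e) for e in tree_edges}) != len(vertices) - 1:
        return False

    # ... and is connected (checked by breadth-first search).
    start = next(iter(vertices))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    if seen != vertices:
        return False

    # Terminal condition: each terminal has degree exactly one.
    if any(len(adj[t]) != 1 for t in terminals):
        return False

    # Budget on Steiner (non-terminal) vertices.
    return len(vertices - terminals) <= r
```

For example, a star with center `c` and terminal leaves `a`, `b`, `d` passes with `r = 1`, while the path `a-b-d` with the same terminals fails because `b` is an internal vertex.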
Mastering terminal environments requires language agents capable of multi-step planning, feedback-grounded execution, and dynamic state adaptation. However, training such agents is currently bottlenecked by a reliance on scraped external repositories, which limits domain diversity, environment controllability, and the targeting of specific capability deficits. We introduce LiteCoder-Terminal-Gen, a zero-dependency synthesis pipeline that autonomously generates executable and verifiable terminal training environments directly from domain specifications. Using this framework, we construct two large-scale resources: LiteCoder-Terminal-SFT, comprising 11,255 expert trajectories across 10 domains, and LiteCoder-Terminal-RL, featuring 602 verifiable environments for trajectory-level preference optimization. Supervised fine-tuning of Qwen-family models on our SFT dataset yields agents that significantly outperform their base counterparts. Notably, our 32B variant achieves 29.06%, 18.54%, and 34.00% pass@1 on Terminal Bench 1.0, 2.0, and Pro, respectively. Furthermore, applying Direct Multi-turn Preference Optimization (DMPO) on our RL environments yields additional performance gains. Thes
Environments are the bottleneck for self-improving agents. Current terminal benchmarks were built for evaluation, not training; reinforcement learning requires a scalable pipeline, not just a dataset. We introduce Endless Terminals, a fully autonomous pipeline that procedurally generates terminal-use tasks without human annotation. The pipeline has four stages: generating diverse task descriptions, building and validating containerized environments, producing completion tests, and filtering for solvability. From this pipeline we obtain 3,255 tasks spanning file operations, log management, data processing, scripting, and database operations. We train agents using vanilla PPO with binary episode-level rewards and a minimal interaction loop: no retrieval, multi-agent coordination, or specialized tools. Despite this simplicity, models trained on Endless Terminals show substantial gains: on our held-out dev set, Llama-3.2-3B improves from 4.0% to 18.2%, Qwen2.5-7B from 10.7% to 53.3%, and Qwen3-8B-openthinker-sft from 42.6% to 59.0%. These improvements transfer to held-out human-curated benchmarks: on
We consider a class of backward stochastic differential equations (BSDEs) with singular terminal condition and develop a numerical scheme to approximate their solution. To this end, we extend an asymptotic development of the BSDE solution known from the power case, which arises from optimal liquidation problems, to more general generators. This expansion allows us to obtain a suitable approximation of the BSDE solution close to the terminal time. Using this as a terminal condition, we analyze the error of a backward Euler implicit scheme and detail its dependence on the terminal condition.
We analyze a class of multidimensional linear-quadratic stochastic control problems with random coefficients, motivated by multi-asset optimal trade execution. The problems feature non-diffusive controlled state dynamics and a terminal constraint that restricts the terminal state to a prescribed random linear subspace. We derive the associated Riccati backward stochastic differential equation (BSDE) and identify a suitable formalization of its singular terminal condition. Via a penalization approach, we establish existence of a minimal supersolution of the Riccati BSDE and use it to characterize both the value function and the optimal control. We analyze the asymptotic behavior of the supersolution near terminal time and discuss special cases where closed-form solutions can be obtained.
We study an optimal control problem for the heat equation with a prescribed terminal state. To circumvent the difficulty of enforcing a hard terminal constraint, we analyze a penalized formulation and prove that the corresponding optimal controls and terminal states converge to the exact constrained solution as the penalty parameter \(α\to \infty\). We establish explicit quantitative convergence estimates of order \(O(α^{-θ})\), including the sharp \(O(1/α)\) rate under stronger modal summability assumptions on the terminal mismatch. A finite-dimensional prototype is used to illustrate the underlying projection structure, while numerical illustrations are reported in a companion study.
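The \(O(1/α)\) behavior of a penalized terminal constraint can be seen in a scalar toy problem. The sketch below is an illustrative assumption, not the paper's finite-dimensional prototype: it minimizes \(\tfrac12 u^2 + \tfrac{α}{2}(a u - d)^2\) over a scalar control \(u\), where the hypothetical constants `a` and `d` stand in for the control-to-terminal-state map and the target terminal state.

```python
def penalized_control(a, d, alpha):
    """Closed-form minimizer of 0.5*u**2 + 0.5*alpha*(a*u - d)**2.

    Setting the derivative u + alpha*a*(a*u - d) to zero gives
    u = alpha*a*d / (1 + alpha*a**2).
    """
    return alpha * a * d / (1.0 + alpha * a * a)


def terminal_mismatch(a, d, alpha):
    """Distance |a*u_alpha - d| between the penalized terminal state
    and the target; algebraically equal to |d| / (1 + alpha*a**2)."""
    return abs(a * penalized_control(a, d, alpha) - d)


if __name__ == "__main__":
    a, d = 0.5, 1.0  # hypothetical toy values
    for alpha in (1e1, 1e2, 1e3, 1e4):
        # alpha * mismatch approaches |d| / a**2, i.e. mismatch = O(1/alpha).
        print(alpha, terminal_mismatch(a, d, alpha), alpha * terminal_mismatch(a, d, alpha))
```

In this toy setting the penalized control tends to the exactly constrained control `d / a` as `alpha` grows, and the terminal mismatch decays like `1/alpha`.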
We consider the problem of autonomous channel access (AutoCA), where a group of terminals tries to discover a communication strategy with an access point (AP) via a common wireless channel in a distributed fashion. Due to the irregular topology and the limited communication range of terminals, a practical challenge for AutoCA is the hidden terminal problem, which is notorious in wireless networks for degrading throughput and delay performance. To meet the challenge, this paper presents a new multi-agent deep reinforcement learning paradigm, dubbed MADRL-HT, tailored for AutoCA in the presence of hidden terminals. MADRL-HT exploits topological insights and transforms the observation space of each terminal into a scalable form independent of the number of terminals. To compensate for the partial observability, we put forth a look-back mechanism such that the terminals can infer behaviors of their hidden terminals from the carrier-sensed channel states as well as feedback from the AP. A window-based global reward function is proposed, whereby the terminals are instructed to maximize the system throughput while balancing the terminals' transmission opportunities over the cours
In this paper we prove the existence of Hölder continuous terminal embeddings of any desired $X \subseteq \mathbb{R}^d$ into $\mathbb{R}^{m}$ with $m=\mathcal{O}(\varepsilon^{-2}ω(S_X)^2)$, for arbitrarily small distortion $\varepsilon$, where $ω(S_X)$ denotes the Gaussian width of the unit secants of $X$. More specifically, when $X$ is a finite set we provide terminal embeddings that are locally $\frac{1}{2}$-Hölder almost everywhere, and when $X$ is infinite with positive reach we give terminal embeddings that are locally $\frac{1}{4}$-Hölder everywhere sufficiently close to $X$ (i.e., within all tubes around $X$ of radius less than $X$'s reach). When $X$ is a compact $d$-dimensional submanifold of $\mathbb{R}^N$, an application of our main results provides terminal embeddings into $\tilde{\mathcal{O}}(d)$-dimensional space that are locally Hölder everywhere sufficiently close to the manifold.
A new scenario for generating a secret key and two private keys among three Terminals in the presence of an external eavesdropper is considered. Terminals 1, 2 and 3 intend to share a common secret key concealed from the external eavesdropper (Terminal 4) and simultaneously, each of Terminals 1 and 2 intends to share a private key with Terminal 3 while keeping it concealed from each other and from Terminal 4. All four Terminals observe i.i.d. outputs of correlated sources and there is a public channel from Terminal 3 to Terminals 1 and 2. An inner bound of the "secret key-private keys capacity region" is derived and the single letter capacity regions are obtained for some special cases.
In this paper, we propose a distributed model predictive control (DMPC) scheme for linear time-invariant constrained systems which admit a separable structure. To exploit the merits of distributed computation algorithms, the stabilizing terminal controller, value function and invariant terminal set of the DMPC optimization problem need to respect the loosely coupled structure of the system. Although existing methods in the literature address this task, they typically decouple the synthesis of terminal controllers and value functions from that of terminal sets. In addition, these approaches do not explicitly consider the effect of the current state of the system in the synthesis process. These limitations can lead to poor performance of the resulting DMPC scheme, since it may admit small or even empty terminal sets. Unlike other approaches, this paper presents a unified framework to encapsulate the synthesis of both the stabilizing terminal controller and invariant terminal set into the DMPC formulation. Conditions for Lyapunov stability and invariance are imposed in the synthesis problem in a way that allows the value function and invariant terminal set to admit the desired distribu
The purpose of this note is a wide generalization of the topological results of various classes of ideals of rings, semirings, and modules, endowed with Zariski topologies, to strongly irreducible ideals (endowed with Zariski topologies) of monoids, called terminal spaces. We show that terminal spaces are $T_0$, quasi-compact, and every nonempty irreducible closed subset has a unique generic point. We characterize arithmetic monoids in terms of terminal spaces. Finally, we provide necessary and sufficient conditions for the subspaces of maximal and prime ideals to be dense in the corresponding terminal spaces.
In Terminal Monitoring Set (TMS), the input is an undirected graph $G=(V,E)$, together with a collection $T$ of terminal pairs, and the goal is to find a subset $S$ of minimum size that hits a shortest path between every pair of terminals. We show that this problem is W[2]-hard with respect to solution size. On the positive side, we show that TMS is fixed-parameter tractable with respect to solution size plus distance to cluster, solution size plus neighborhood diversity, and feedback edge number. For the weighted version of the problem, we obtain an FPT algorithm with respect to vertex cover number, and for a relaxed version of the problem, we show that it is W[1]-hard with respect to solution size plus feedback vertex number.
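The feasibility condition for TMS can be checked with breadth-first search: in an unweighted graph, a vertex $v$ lies on some shortest $s$-$t$ path exactly when $d(s,v) + d(v,t) = d(s,t)$. The following is a minimal sketch under that reading of the problem statement; the names `bfs_dist` and `is_monitoring_set`, the adjacency-dict input, and the choice to treat a disconnected pair as unmonitored are assumptions for illustration, not the authors' algorithm.

```python
from collections import deque


def bfs_dist(adj, source):
    """Unweighted shortest-path distances from source (adjacency dict)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def is_monitoring_set(adj, pairs, S):
    """Return True if, for every terminal pair (s, t), some vertex of S
    lies on at least one shortest s-t path.

    Assumption: a pair with no connecting path counts as not monitored.
    """
    # Distances from each candidate vertex; the graph is undirected,
    # so d(v, s) == d(s, v).
    dist_from = {v: bfs_dist(adj, v) for v in S}
    for s, t in pairs:
        d_st = bfs_dist(adj, s).get(t)
        if d_st is None:
            return False
        hit = any(
            s in dv and t in dv and dv[s] + dv[t] == d_st
            for dv in dist_from.values()
        )
        if not hit:
            return False
    return True
```

On the 4-cycle `a-b-c-d-a` with the single pair `(a, c)`, the set `{b}` is feasible because `a-b-c` is a shortest path, even though the other shortest path `a-d-c` is not hit.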
We present a collection of results that imply that an endofunctor on a category has a terminal object obtainable as a countable limit of its terminal-coalgebra chain. This holds for finitary endofunctors preserving nonempty binary intersections on locally finitely presentable categories, assuming that the posets of strong quotients and subobjects of finitely presentable objects satisfy the descending chain condition. This allows one to adapt finiteness arguments that were originally advanced by Worrell concerning terminal coalgebras for finitary set functors. Examples include the categories of sets, posets, vector spaces, graphs, nominal sets, and presheaves on finite sets. Worrell also described, without proof, the terminal-coalgebra chain of the finite power-set functor. We provide a detailed proof following his ideas. We then turn to polynomial endofunctors on the categories of Hausdorff topological spaces and metric spaces. The Vietoris space of compact subsets of a given Hausdorff space yields an endofunctor $\mathscr{V}$ on the category of Hausdorff spaces. Vietoris polynomial endofunctors on that category are built from $\mathscr{V}$, the identity and constant functors by fo