Autonomous medical robots hold promise to improve patient outcomes, reduce provider workload, democratize access to care, and enable superhuman precision. However, autonomous medical robotics has been limited by a fundamental data problem: existing medical robotic datasets are small, single-embodiment, and rarely shared openly, restricting the development of foundation models that the field needs to advance. We introduce Open-H-Embodiment, the largest open dataset of medical robotic video with synchronized kinematics to date, spanning more than 50 institutions and multiple robotic platforms including the CMR Versius, Intuitive Surgical's da Vinci, da Vinci Research Kit (dVRK), Rob Surgical BiTrack, Virtual Incision's MIRA, Moon Surgical Maestro, and a variety of custom systems, spanning surgical manipulation, robotic ultrasound, and endoscopy procedures. We demonstrate the research enabled by this dataset through two foundation models. GR00T-H is the first open foundation vision-language-action model for medical robotics, which is the only evaluated model to achieve full end-to-end task completion on a structured suturing benchmark (25% of trials vs. 0% for all others) and achieves
Let $B_1$ be the unit disk in ${\mathbb R}^2$. We consider the harmonic map equation $$ -Δu=|\nabla u|^2u,$$ subject to the Dirichlet boundary condition $u(e^{iθ})=(R\cosθ,R\sinθ,\sqrt{1-R^2}):=g_R$, where $0<R<1$ and $u: B_1\to {\mathbb S}^2$ is understood in the weak harmonic-map sense. In 1983, Brezis and Coron proved the existence of two explicit solutions of this nonlinear Dirichlet problem and showed that they are the unique minimizers in their respective relative homotopy classes. In this paper, we resolve a long-standing open question originally posed in their work and later listed as Open Problem 3.1 in Brezis' Favorite Open Problems List. Specifically, we prove that these two explicit maps are the only weak harmonic maps with boundary trace $g_{R}$, thereby providing a definitive affirmative answer to Brezis' open problem. The proof is based on a boundary rigidity argument. An auxiliary potential $X$ associated with $u$, the Pohozaev identity for the Hopf differential, and the planar isoperimetric inequality imply $$|u_r|\equiv R, \qquad u_r\cdot u_θ\equiv0 \qquad\text{on }\partial B_1. $$ Thus the Hopf differential vanishes on the boundary and hence, by holomorphicity, o
The application of large language models (LLMs) in the medical field has garnered significant attention, yet their reasoning capabilities in more specialized domains like anesthesiology remain underexplored. To bridge this gap, we introduce AnesSuite, the first comprehensive dataset suite specifically designed for anesthesiology reasoning in LLMs. The suite features AnesBench, an evaluation benchmark tailored to assess anesthesiology-related reasoning across three levels: factual retrieval (System 1), hybrid reasoning (System 1.x), and complex decision-making (System 2). Alongside this benchmark, the suite includes three training datasets that provide an infrastructure for continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning with verifiable rewards (RLVR). Leveraging this suite, we develop Morpheus, the first baseline model collection for anesthesiology reasoning. Despite undergoing limited training with SFT and group relative policy optimization (GRPO), Morpheus not only achieves substantial improvements in anesthesiology that rival larger-scale models, but also demonstrates enhanced reasoning capabilities across general medical and broad-domain b
The Gaia mission offers opportunities to search for compact binaries not involved in binary interactions (hereafter inert compact binaries), and has resulted in the discovery of binaries containing one black hole (BH) or one neutron star (NS), called "Gaia BHs" and "Gaia NSs", respectively. We have assessed whether Gaia BHs and NSs can be formed in open clusters through dynamical interactions. In order to obtain a large number of inert compact binaries similar to Gaia BHs and NSs, we have performed gravitational $N$-body simulations for a large number of open clusters whose total mass is $1.2 \times 10^8 M_\odot$. These clusters have various masses, metallicities, densities, and binary fractions. We have found that open clusters form Gaia BHs ($10^{-6}$-$10^{-5} M_\odot^{-1}$) much more efficiently than Gaia NSs ($\lesssim 10^{-7} M_\odot^{-1}$) for any cluster parameters. This is quite inconsistent with observational results, because the reported numbers of Gaia BHs and NSs are $3$ and $21$, respectively. Additionally, we have switched off NS natal kicks for $10^4$ open clusters each weighing $10^3 M_\odot$ in order to retain a large number of NSs in open clusters. Then, open clusters form in
Open effective field theories provide a systematic framework for describing systems coupled to an environment, where dissipation, noise, and modified conservation laws naturally arise. Working within the Schwinger-Keldysh formalism, we examine open extensions of three well-studied theories: the superfluid, Maxwell theory, and Einstein gravity. In gauge and gravitational theories, open terms that break advanced symmetries while preserving physical ones are not automatically consistent; they are allowed only if they lead to deformed identities among the equations of motion. We explicitly construct such a term in open gravity and show that it leads to a consistent deformation of the diffeomorphism identities.
We introduce open-sci-ref, a family of dense transformer models trained as research baselines across multiple model (0.13B to 1.7B parameters) and token scales (up to 1T) on 8 recent open reference datasets. Evaluating the models on various standardized benchmarks, our set of training runs establishes reference points that enable researchers to assess the sanity and quality of alternative training approaches across scales and datasets. Intermediate checkpoints allow comparison and study of the training dynamics. The established reference baselines allow training procedures to be compared through their scaling trends, aligning them on a common compute axis. Comparison of open reference datasets reveals that training on NemoTron-CC HQ consistently outperforms other reference datasets, followed by DCLM-baseline and FineWeb-Edu. In addition to intermediate training checkpoints, the release includes logs, code, and downstream evaluations to simplify reproduction, standardize comparison, and facilitate future research.
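As a rough illustration of what aligning runs on a common compute axis can mean, the sketch below uses the widely quoted approximation that training a dense transformer costs about $6ND$ FLOPs for $N$ parameters and $D$ tokens. This rule of thumb, the function name `approx_train_flops`, and the pairing of model sizes with token budgets are assumptions for illustration; the abstract does not state which compute estimate the authors use.

```python
def approx_train_flops(n_params: float, n_tokens: float) -> float:
    """Rough training compute via the C ~ 6 * N * D rule of thumb (an assumption)."""
    return 6.0 * n_params * n_tokens


# Hypothetical runs: only the endpoint scales (0.13B, 1.7B, 1T) come from the
# abstract; pairing each model size with the full 1T budget is illustrative.
runs = {
    "0.13B params, 1T tokens": (0.13e9, 1e12),
    "1.7B params, 1T tokens": (1.7e9, 1e12),
}

for name, (n_params, n_tokens) in runs.items():
    flops = approx_train_flops(n_params, n_tokens)
    print(f"{name}: ~{flops:.2e} FLOPs")
```

Plotting a benchmark score against this estimate, rather than against parameters or tokens alone, would place runs of different sizes on one horizontal axis.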
Fully open multimodal large language models (MLLMs) currently lag behind proprietary counterparts, primarily due to a significant gap in data quality for supervised fine-tuning (SFT). Existing open-source datasets are often plagued by widespread noise and a critical deficit in complex reasoning data, such as Chain-of-Thought (CoT), which hinders the development of advanced model capabilities. Addressing these challenges, our work makes three primary contributions. First, we introduce Honey-Data-15M, a new SFT dataset comprising approximately 15 million QA pairs, processed through multiple cleaning techniques and enhanced with a novel dual-level (short and long) CoT enrichment strategy. Second, we introduce HoneyPipe, the data curation pipeline, and its underlying framework DataStudio, providing the community with a transparent and adaptable methodology for data curation that moves beyond static dataset releases. Finally, to validate our dataset and pipeline, we train Bee-8B, an 8B model on Honey-Data-15M. Experiments show that Bee-8B establishes a new state-of-the-art (SOTA) for fully open MLLMs, achieving performance that is competitive with, and in some cases surpasses, recent se
This paper reviews research literature on Diamond Open Access (DOA) journals - sometimes also called Platinum Open Access - that was produced after this journal segment started to become a priority in European research policy around 2020. It contextualizes the current science policy debate, critically examines different understandings of DOA, and reviews studies on the role of such journals in scholarly communication. Most existing research consists of quantitative studies focusing on aspects such as the number of DOA journals, their publication output, the diversity of the landscape in terms of subject areas, languages, publishing entities, indexing in major databases, awareness and perception among scholars, cost analyses, as well as insights into the internal operations of DOA journals. The review shows that research on DOA journals is partly influenced by the science policy discourse in at least two ways: first, through the normativity inherent in that discourse, and second, through the temporality of policy-driven research of practical relevance, which leaves important aspects of the phenomenon understudied. Moreover, research on the DOA journal landscape has implications beyo
In this paper, we prove that the open and closed strings are $O(D,D)$ equivalent. The equivalence requires an AdS geometry near the boundaries. The $O(D,D)$ invariance is introduced into the Polyakov action by Tseytlin's action. Traditionally, there exist disconnected open-open or closed-closed configurations in the solution space of Tseytlin's action. The open-closed configuration is ruled out by the mixed terms of the dual fields. We show that, under some very general guidelines, the dual fields are consistently decoupled if and only if the near-horizon geometry is $AdS_5$. We then have open-closed and closed-open configurations in different limits of the distances of the $D3$-brane pairs. Inherited from the definition of the theory, these four configurations are of course related to each other by $O(D,D)$ transformations. We therefore conclude that both the open/closed relation and open/closed duality can be derived from $O(D,D)$ symmetries. We then demonstrate that the open/closed relation does connect commutative open and closed strings. By analyzing the couplings of the configurations, the low energy effective limits of our results consequently predict the AdS/CFT correspo
Machine learning (ML) offers a powerful path toward discovering sustainable polymer materials, but progress has been limited by the lack of large, high-quality, and openly accessible polymer datasets. The Open Polymer Challenge (OPC) addresses this gap by releasing the first community-developed benchmark for polymer informatics, featuring a dataset with 10K polymers and 5 properties: thermal conductivity, radius of gyration, density, fractional free volume, and glass transition temperature. The challenge centers on multi-task polymer property prediction, a core step in virtual screening pipelines for materials discovery. Participants developed models under realistic constraints that include small data, label imbalance, and heterogeneous simulation sources, using techniques such as feature-based augmentation, transfer learning, self-supervised pretraining, and targeted ensemble strategies. The competition also revealed important lessons about data preparation, distribution shifts, and cross-group simulation consistency, informing best practices for future large-scale polymer datasets. The resulting models, analysis, and released data create a new foundation for molecular AI in polym
Medical large language models (LLMs) have gained popularity recently due to their significant practical utility. However, most existing research focuses on general medicine, and there is a need for in-depth study of LLMs in specific fields like anesthesiology. To fill the gap, we introduce Hypnos, a Chinese Anesthesia model built upon existing LLMs, e.g., Llama. Hypnos' contributions have three aspects: 1) Data acquired from current LLMs, for example via Self-Instruct, likely include inaccuracies. Hypnos implements a cross-filtering strategy to improve the data quality. This strategy involves using one LLM to assess the quality of the data generated by another LLM and filtering out the low-quality data. 2) Hypnos employs a general-to-specific training strategy that starts by fine-tuning LLMs on general medicine data and subsequently improves the fine-tuned LLMs using data specifically from Anesthesiology. The general medical data supplement the medical expertise in Anesthesiology and enhance the effectiveness of Hypnos' generation. 3) We introduce a standardized benchmark for evaluating medical LLMs in Anesthesiology. Our benchmark includes both publicly availa
In this work, I collect and discuss a series of open questions in one-dimensional geometric optimization in Euclidean spaces. The focus is on two classes of problems: maximal distance minimizers and Steiner trees. Maximal distance minimizers concern finding a connected set of minimal length whose closed $r$-neighborhood covers a given compact set, whereas Steiner trees aim to find a minimal-length set connecting a prescribed set of points. For both problems, I briefly summarize known results and highlight the remaining open questions. While some questions can be approached with elementary methods, others remain highly challenging.
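The two problems can be written out as follows. This is a sketch in notation chosen here, not taken from the text: $\mathcal{H}^1$ denotes one-dimensional Hausdorff measure (length), $M$ the given compact set, and $A$ the prescribed set of points.

```latex
% Maximal distance minimizers: given a compact set M \subset \mathbb{R}^n and r > 0,
\min \Big\{ \mathcal{H}^1(\Sigma) \;:\; \Sigma \subset \mathbb{R}^n \text{ closed and connected},\
\max_{y \in M} \operatorname{dist}(y, \Sigma) \le r \Big\}

% Steiner problem: given a prescribed set of points A \subset \mathbb{R}^n,
\min \Big\{ \mathcal{H}^1(S) \;:\; S \subset \mathbb{R}^n,\ S \cup A \text{ connected} \Big\}
```

The constraint in the first problem restates that the closed $r$-neighborhood of $\Sigma$ covers $M$.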
This paper examines the state of Open Data in Latvia in the middle of 2014. The study is divided into two parts: (i) a survey of the open data situation and (ii) an overview of available open data sets. The first part examines the general open data climate in Latvia according to the guidelines of the OKFN Open Data Index, making the results comparable to those of other participants in this index. The second part examines datasets made available on the Latvia Open Data community catalogue, the only open data catalogue available in Latvia at the moment. We conclude that Latvian public sector open data mostly fulfil the basic criteria (e.g., data is available) of the Open Data Index but fail on more advanced criteria: the majority of data considered in the study are not published in machine-readable form and are not available for bulk download, and none of the data sources have open license statements.
This text is a short introduction to the physics of driven-dissipative many-body systems, focusing on a few selected topics. Beyond its more ``historical'' interest in atomic physics and quantum optics, the modeling and study of dissipative phenomena in open quantum systems is now pivotal to understanding quantum hardware platforms. While the lack of a thermodynamic potential for these out-of-equilibrium open systems makes it theoretically challenging to investigate their physics, at the same time it allows going beyond the thermodynamic paradigms and investigating new and exotic phenomena. We will focus on one of the simplest, yet most effective, descriptions of open quantum systems, namely the (Gorini-Kossakowski-Sudarshan-) Lindblad master equation. This phenomenological approach describes quantum systems that weakly interact with their surrounding environment. Although many of the results derived below will apply to any quantum system, we will focus in particular on bosonic/spin systems.
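For orientation, the Lindblad master equation is commonly written in the form below. The notation is the standard textbook one and is an assumption here, since the text above does not fix symbols: $\rho$ is the system density matrix, $H$ the Hamiltonian, and $L_k$ the jump operators with rates $\gamma_k \ge 0$.

```latex
\frac{d\rho}{dt}
  = -\frac{i}{\hbar}\,[H,\rho]
  + \sum_k \gamma_k \left( L_k \rho L_k^\dagger
  - \frac{1}{2}\left\{ L_k^\dagger L_k,\ \rho \right\} \right)
```

The commutator term generates the unitary evolution, while the sum describes dissipation induced by the environment.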
The fourth industrial revolution promotes the integration of Information Technology (IT) and strategic resources. New IT demands and uses have been leading to changes in business processes and corporate governance. Lately, the financial industry has adopted a new integrated banking model known as Open Banking (OB), and the advent of cryptocurrencies has led to the materialization of the Digital Economy (DE). Considering these facts, this paper aims to identify, through a literature review, some IT enabling factors that allow the conception of a new industry design (or governance), specifically in the financial industry, illustrated by the cases of Open Banking and the Digital Economy. The paper is structured mainly around a literature review, followed by results, discussion, and conclusions. Five potential enabling factors were found. Keywords: Digital Economy, Information Technology (IT), Open Banking.
Social coding platforms have revolutionized collaboration in software development, leading to the use of software bots for streamlining operations. However, the presence of open-source software (OSS) bots gives rise to problems including impersonation, spamming, bias, and security risks. Identifying bot accounts and behavior is a challenging task in OSS projects. This research aims to investigate bots' behavior in open-source software projects and identify bot accounts with maximum possible accuracy. Our team gathered a dataset of 19,779 accounts that meet standardized criteria to enable future research on bots in open-source projects. We follow a rigorous workflow to ensure that the data we collect is accurate, generalizable, scalable, and up-to-date. We've identified four types of bot accounts in open-source software projects by analyzing their behavior across 17 features in 5 dimensions. Our team created BotHawk, a highly effective model for detecting bots in open-source software projects. It outperforms other models, achieving an AUC of 0.947 and an F1-score of 0.89. BotHawk can detect a wider variety of bots, including CI/CD and scanning bots. Furthermore, we find that the numbe
Open generative models are vitally important for the community, allowing for fine-tunes and serving as baselines when presenting new models. However, most current text-to-audio models are private and not accessible for artists and researchers to build upon. Here we describe the architecture and training process of a new open-weights text-to-audio model trained with Creative Commons data. Our evaluation shows that the model's performance is competitive with the state-of-the-art across various metrics. Notably, the reported FDopenl3 results (measuring the realism of the generations) showcase its potential for high-quality stereo sound synthesis at 44.1kHz.
Deploying robots in open-ended unstructured environments such as homes has been a long-standing research problem. However, robots are often studied only in closed-off lab settings, and prior mobile manipulation work is restricted to pick-move-place, which is arguably just the tip of the iceberg in this area. In this paper, we introduce Open-World Mobile Manipulation System, a full-stack approach to tackle realistic articulated object operation, e.g. real-world doors, cabinets, drawers, and refrigerators in open-ended unstructured environments. The robot utilizes an adaptive learning framework to initially learn from a small set of data through behavior cloning, followed by learning from online practice on novel objects that fall outside the training distribution. We also develop a low-cost mobile manipulation hardware platform capable of safe and autonomous online adaptation in unstructured environments with a cost of around 20,000 USD. In our experiments we utilize 20 articulated objects across 4 buildings on the CMU campus. With less than an hour of online learning for each object, the system is able to increase success rate from 50% of BC pre-training to 95% using online adaptat
This chapter addresses emergent ethical issues in producing, using, curating, and providing services for open data. Our goal is to provide an introduction to how ethical topics in open data manifest in practical dilemmas for scholarly communications and some approaches to understanding and working through them. We begin with a brief overview of what can be thought of as three basic theories of ethics that intersect with dilemmas in openness, accountability, transparency, and fairness in data: Virtue, Consequential, and Non-consequential ethics. We then map these kinds of ethics to the practical questions that arise in provisioning infrastructures, providing services, and supporting sustainable research in science and scholarship that depends upon open access to data. Throughout, we attempt to offer concrete examples of potential ethical dilemmas facing scholarly communication with respect to open data, and try to make clear what kinds of ethical positions are helpful to practitioners. In doing so, we hope to both clarify the ethical questions facing librarians doing practical work to support open data access, as well as situate current debates in the field with respect to these thr
Testing the aerodynamics of micro- and nano-UAVs without actually flying is highly challenging. To address this issue, we introduce Open Gimbal, a specially designed 3 Degrees of Freedom platform that caters to the unique requirements of micro- and nano-UAVs. This platform allows for unrestricted and free rotational motion, enabling comprehensive experimentation and evaluation of these UAVs. Our approach focuses on simplicity and accessibility. We developed an open-source, 3D printable electro-mechanical design that has minimal size and low complexity. This design facilitates easy replication and customization, making it widely accessible to researchers and developers. Addressing the challenges of sensing flight dynamics at a small scale, we have devised an integrated wireless batteryless sensor subsystem. Our innovative solution eliminates the need for complex wiring and instead uses wireless power transfer for sensor data reception. To validate the effectiveness of open gimbal, we thoroughly evaluate and test its communication link and sensing performance using a typical nano-quadrotor. Through comprehensive testing, we verify the reliability and accuracy of open gimbal in real-w