The application of large language models (LLMs) in the medical field has garnered significant attention, yet their reasoning capabilities in more specialized domains like anesthesiology remain underexplored. To bridge this gap, we introduce AnesSuite, the first comprehensive dataset suite specifically designed for anesthesiology reasoning in LLMs. The suite features AnesBench, an evaluation benchmark tailored to assess anesthesiology-related reasoning across three levels: factual retrieval (System 1), hybrid reasoning (System 1.x), and complex decision-making (System 2). Alongside this benchmark, the suite includes three training datasets that provide an infrastructure for continued pre-training (CPT), supervised fine-tuning (SFT), and reinforcement learning with verifiable rewards (RLVR). Leveraging this suite, we develop Morpheus, the first baseline model collection for anesthesiology reasoning. Despite undergoing limited training with SFT and group relative policy optimization (GRPO), Morpheus not only achieves substantial improvements in anesthesiology that rival larger-scale models, but also demonstrates enhanced reasoning capabilities across general medical and broad-domain b
Medical large language models (LLMs) have gained popularity recently due to their significant practical utility. However, most existing research focuses on general medicine, and there is a need for in-depth study of LLMs in specific fields like anesthesiology. To fill the gap, we introduce Hypnos, a Chinese Anesthesia model built upon existing LLMs, e.g., Llama. Hypnos' contributions have three aspects: 1) The data, such as utilizing Self-Instruct, acquired from current LLMs likely includes inaccuracies. Hypnos implements a cross-filtering strategy to improve the data quality. This strategy involves using one LLM to assess the quality of the generated data from another LLM and filtering out the data with low quality. 2) Hypnos employs a general-to-specific training strategy that starts by fine-tuning LLMs using the general medicine data and subsequently improving the fine-tuned LLMs using data specifically from Anesthesiology. The general medical data supplement the medical expertise in Anesthesiology and enhance the effectiveness of Hypnos' generation. 3) We introduce a standardized benchmark for evaluating medical LLM in Anesthesiology. Our benchmark includes both publicly availa
This article introduces a metamodel for the Business Model Canvas (BMC) using the Unified Modelling Language (UML), together with a dedicated Domain-Specific Modelling Language (DSML) tool. Although the BMC is widely adopted by both practitioners and scholars, significant challenges remain in formally modelling business models, particularly with regard to explicit specification of inter-component relationships, while preserving the simplicity that characterises the BMC. Addressing this tension between modelling rigour and practical relevance, this research adopts a Design Science Research approach to formally specify relationships among BMC components and to strengthen their theoretical grounding through an adaptation of the V 4 framework. The proposed metamodel consolidates BMC relationships into three core types: supports, determines, and affects, providing explicit semantics while remaining accessible to end users through graphical tooling. The findings highlight that formally specifying relationships significantly improves the interpretability and consistency of BMC representations. The proposed metamodel and tool offer a rigorous yet usable foundation for developing DSML-based
Bounded Model Checking (BMC) is a widely used software verification technique. Despite its successes, the technique has several limiting factors, from state-space explosion to lack of completeness. Over the years, interval analysis has repeatedly been proposed as a partial solution to these limitations. In this work, we evaluate whether the computational cost of interval analysis yields significant enough improvements in BMC's performance to justify its use. In more detail, we quantify the benefits of interval analysis on two benchmarks: the Intel Core Power Management firmware and 9537 programs in the ReachSafety category of the International Competition on Software Verification. Our results show that interval analysis is essential in solving 203 unique benchmarks.
Finding software vulnerabilities in concurrent programs is a challenging task due to the size of the state-space exploration, as the number of interleavings grows exponentially with the number of program threads and statements. We propose and evaluate EBF (Ensembles of Bounded Model Checking with Fuzzing) -- a technique that combines Bounded Model Checking (BMC) and Gray-Box Fuzzing (GBF) to find software vulnerabilities in concurrent programs. Since there are no publicly-available GBF tools for concurrent code, we first propose OpenGBF -- a new open-source concurrency-aware gray-box fuzzer that explores different thread schedules by instrumenting the code under test with random delays. Then, we build an ensemble of a BMC tool and OpenGBF in the following way. On the one hand, when the BMC tool in the ensemble returns a counterexample, we use it as a seed for OpenGBF, thus increasing the likelihood of executing paths guarded by complex mathematical expressions. On the other hand, we aggregate the outcomes of the BMC and GBF tools in the ensemble using a decision matrix, thus improving the accuracy of EBF. We evaluate EBF against state-of-the-art pure BMC tools and show that it can
Bounded model checking (BMC) is a widely used technique for formal property verification (FPV), where the transition relation is repeatedly unrolled to increasing depths and encoded into Boolean satisfiability (SAT) queries. As the bound grows deeper, these SAT queries typically become more difficult to solve, posing scalability challenges. Howevefor, many FPV problems involve multiple copies of related circuits, creating opportunities to simplify the unrolled transition relation. Motivated by the functionally reduced and-inverter-graph (FRAIG) technique, we propose FRAIG-BMC, which incrementally identifies and merges functionally equivalent nodes during the unrolling process. By reducing redundancy, FRAIG-BMC improves the efficiency of SAT solving and accelerates property checking. Experiments demonstrate that FRAIG-BMC significantly speeds up BMC across a range of applications, including sequential equivalence checking, partial retention register detection, and information flow checking
Bounded model checking (BMC) and fuzzing techniques are among the most effective methods for detecting errors and security vulnerabilities in software. However, there are still shortcomings in detecting these errors due to the inability of existent methods to cover large areas in target code. We propose FuSeBMC v4, a test generator that synthesizes seeds with useful properties, that we refer to as smart seeds, to improve the performance of its hybrid fuzzer thereby achieving high C program coverage. FuSeBMC works by first analyzing and incrementally injecting goal labels into the given C program to guide BMC and Evolutionary Fuzzing engines. After that, the engines are employed for an initial period to produce the so-called smart seeds. Finally, the engines are run again, with these smart seeds as starting seeds, in an attempt to achieve maximum code coverage / find bugs. During both seed generation and normal running, coordination between the engines is aided by the Tracer subsystem. This subsystem carries out additional coverage analysis and updates a shared memory with information on goals covered so far. Furthermore, the Tracer evaluates test cases dynamically to convert cases
With the skyrocketing costs of GPUs and their virtual instances in the cloud, there is a significant desire to use CPUs for large language model (LLM) inference. KV cache update, often implemented as allocation, copying, and in-place strided update for each generated token, incurs significant overhead. As the sequence length increases, the allocation and copy overheads dominate the performance. Alternate approaches may allocate large KV tensors upfront to enable in-place updates, but these matrices (with zero-padded rows) cause redundant computations. In this work, we propose a new KV cache allocation mechanism called Balancing Memory and Compute (BMC). BMC allocates, once every r iterations, KV tensors with r redundant rows, allowing in-place update without copy overhead for those iterations, but at the expense of a small amount of redundant computation. Second, we make an interesting observation that the extra rows allocated in the KV tensors and the resulting redundant computation can be repurposed for Speculative Decoding (SD) that improves token generation efficiency. Last, BMC represents a spectrum of design points with different values of r. To identify the best-performing d
The training process of generative adversarial networks (GANs) is unstable and does not converge globally. In this paper, we examine the stability of GANs from the perspective of control theory and propose a universal higher-order noise-based controller called Brownian Motion Controller (BMC). Starting with the prototypical case of Dirac-GANs, we design a BMC to retrieve precisely the same but reachable optimal equilibrium. We theoretically prove that the training process of DiracGANs-BMC is globally exponential stable and derive bounds on the rate of convergence. Then we extend our BMC to normal GANs and provide implementation instructions on GANs-BMC. Our experiments show that our GANs-BMC effectively stabilizes GANs' training under StyleGANv2-ada frameworks with a faster rate of convergence, a smaller range of oscillation, and better performance in terms of FID score.
Satisfiability Modulo Theories (SMT) solvers have been successfully applied to solve many problems in formal verification such as bounded model checking (BMC) for many classes of systems from integrated circuits to cyber-physical systems. Typically, BMC is performed by checking satisfiability of a possibly long, but quantifier-free formula. However, BMC problems can naturally be encoded as quantified formulas over the number of BMC steps. In this approach, we then use decision procedures supporting quantifiers to check satisfiability of these quantified formulas. This approach has previously been applied to perform BMC using a Quantified Boolean Formula (QBF) encoding for purely discrete systems, and then discharges the QBF checks using QBF solvers. In this paper, we present a new quantified encoding of BMC for rectangular hybrid automata (RHA), which requires using more general logics due to the real (dense) time and real-valued state variables modeling continuous states. We have implemented a preliminary experimental prototype of the method using the HyST model transformation tool to generate the quantified BMC (QBMC) queries for the Z3 SMT solver. We describe experimental result
Bounded model checking (BMC) is an effective technique for hunting bugs by incrementally exploring the state space of a system. To reason about infinite traces through a finite structure and to ultimately obtain completeness, BMC incorporates loop conditions that revisit previously observed states. This paper focuses on developing loop conditions for BMC of HyperLTL- a temporal logic for hyperproperties that allows expressing important policies for security and consistency in concurrent systems, etc. Loop conditions for HyperLTL are more complicated than for LTL, as different traces may loop inconsistently in unrelated moments. Existing BMC approaches for HyperLTL only considered linear unrollings without any looping capability, which precludes both finding small infinite traces and obtaining a complete technique. We investigate loop conditions for HyperLTL BMC, where the HyperLTL formula can contain up to one quantifier alternation. We first present a general complete automata-based technique which is based on bounds of maximum unrollings. Then, we introduce alternative simulation-based algorithms that allow exploiting short loops effectively, generating SAT queries whose satisfia
Bayesian model comparison (BMC) offers a principled approach for assessing the relative merits of competing computational models and propagating uncertainty into model selection decisions. However, BMC is often intractable for the popular class of hierarchical models due to their high-dimensional nested parameter structure. To address this intractability, we propose a deep learning method for performing BMC on any set of hierarchical models which can be instantiated as probabilistic programs. Since our method enables amortized inference, it allows efficient re-estimation of posterior model probabilities and fast performance validation prior to any real-data application. In a series of extensive validation studies, we benchmark the performance of our method against the state-of-the-art bridge sampling method and demonstrate excellent amortized inference across all BMC settings. We then showcase our method by comparing four hierarchical evidence accumulation models that have previously been deemed intractable for BMC due to partly implicit likelihoods. Additionally, we demonstrate how transfer learning can be leveraged to enhance training efficiency. We provide reproducible code for
Bounded Model Checking (BMC) is a powerful technique for proving unsafety. However, finding deep counterexamples that require a large bound is challenging for BMC. On the other hand, acceleration techniques compute "shortcuts" that "compress" many execution steps into a single one. In this paper, we tightly integrate acceleration techniques into SMT-based bounded model checking. By adding suitable "shortcuts" on the fly, our approach can quickly detect deep counterexamples. Moreover, using so-called blocking clauses, our approach can prove safety of examples where BMC diverges. An empirical comparison with other state-of-the-art techniques shows that our approach is highly competitive for proving unsafety, and orthogonal to existing techniques for proving safety.
Resolving the exploration-exploitation trade-off remains a fundamental problem in the design and implementation of reinforcement learning (RL) algorithms. In this paper, we focus on model-free RL using the epsilon-greedy exploration policy, which despite its simplicity, remains one of the most frequently used forms of exploration. However, a key limitation of this policy is the specification of $\varepsilon$. In this paper, we provide a novel Bayesian perspective of $\varepsilon$ as a measure of the uniformity of the Q-value function. We introduce a closed-form Bayesian model update based on Bayesian model combination (BMC), based on this new perspective, which allows us to adapt $\varepsilon$ using experiences from the environment in constant time with monotone convergence guarantees. We demonstrate that our proposed algorithm, $\varepsilon$-\texttt{BMC}, efficiently balances exploration and exploitation on different problems, performing comparably or outperforming the best tuned fixed annealing schedules and an alternative data-dependent $\varepsilon$ adaptation scheme proposed in the literature.
In order to address the need for more capacity and coverage in the 5th generation (5G) of wireless networks, ultra-dense wireless networks are introduced which mainly consist of indoor small cells. This new architecture has paved the way for the advent of a new concept called Micro Operator. A micro operator is an entity that provides connections and local 5G services to the customers and relies on local frequency resources. We discuss business models of micro operators in a 5G coopetitive environment and develop a framework to indicate the business model canvas (BMC) of this new concept. Providing BMC for new businesses is a strategic approach to offer value to customers. In this research study, BMC and its elements are introduced and explained for 5G micro operators.
This paper investigates approaches to parallelizing Bounded Model Checking (BMC) for shared memory environments as well as for clusters of workstations. We present a generic framework for parallelized BMC named Tarmo. Our framework can be used with any incremental SAT encoding for BMC but for the results in this paper we use only the current state-of-the-art encoding for full PLTL. Using this encoding allows us to check both safety and liveness properties, contrary to an earlier work on distributing BMC that is limited to safety properties only. Despite our focus on BMC after it has been translated to SAT, existing distributed SAT solvers are not well suited for our application. This is because solving a BMC problem is not solving a set of independent SAT instances but rather involves solving multiple related SAT instances, encoded incrementally, where the satisfiability of each instance corresponds to the existence of a counterexample of a specific length. Our framework includes a generic architecture for a shared clause database that allows easy clause sharing between SAT solver threads solving various such instances. We present extensive experimental results obtained with multip
The vast number of interleavings that a concurrent program can have is typically identified as the root cause of the difficulty of automatic analysis of concurrent software. Weak memory is generally believed to make this problem even harder. We address both issues by modelling programs' executions with partial orders rather than the interleaving semantics (SC). We implemented a software analysis tool based on these ideas. It scales to programs of sufficient size to achieve first-time formal verification of non-trivial concurrent systems code over a wide range of models, including SC, Intel x86 and IBM Power.
Bounded model checking (BMC) is an efficient formal verification technique which allows for desired properties of a software system to be checked on bounded runs of an abstract model of the system. The properties are frequently described in some temporal logic and the system is modeled as a state transition system. In this paper we propose a novel counting logic, $\mathcal{L}_{C}$, to describe the temporal properties of client-server systems with an unbounded number of clients. We also propose two dimensional bounded model checking ($2D$-BMC) strategy that uses two distinguishable parameters, one for execution steps and another for the number of tokens in the net representing a client-server system, and these two evolve separately, which is different from the standard BMC techniques in the Petri Nets formalism. This $2D$-BMC strategy is implemented in a tool called DCModelChecker which leverages the $2D$-BMC technique with a state-of-the-art satisfiability modulo theories (SMT) solver Z3. The system is given as a Petri Net and properties specified using $\mathcal{L}_{C}$ are encoded into formulas that are checked by the solver. Our tool can also work on industrial benchmarks from t
Apart from the actual CPU, modern server motherboards contain other auxiliary components, for example voltage regulators for power management. Those are connected to the CPU and the separate Baseboard Management Controller (BMC) via the I2C-based PMBus. In this paper, using the case study of the widely used Supermicro X11SSL motherboard, we show how remotely exploitable software weaknesses in the BMC (or other processors with PMBus access) can be used to access the PMBus and then perform hardware-based fault injection attacks on the main CPU. The underlying weaknesses include insecure firmware encryption and signing mechanisms, a lack of authentication for the firmware upgrade process and the IPMI KCS control interface, as well as the motherboard design (with the PMBus connected to the BMC and SMBus by default). First, we show that undervolting through the PMBus allows breaking the integrity guarantees of SGX enclaves, bypassing Intel's countermeasures against previous undervolting attacks like Plundervolt/V0ltPwn. Second, we experimentally show that overvolting outside the specified range has the potential of permanently damaging Intel Xeon CPUs, rendering the server inoperable. W
First, under a geometric ergodicity assumption, we provide some limit theorems and some probability inequalities for the bifurcating Markov chains (BMC). The BMC model was introduced by Guyon to detect cellular aging from cell lineage, and our aim is thus to complete his asymptotic results. The deviation inequalities are then applied to derive first result on the moderate deviation principle (MDP) for a functional of the BMC with a restricted range of speed, but with a function which can be unbounded. Next, under a uniform geometric ergodicity assumption, we provide deviation inequalities for the BMC and apply them to derive a second result on the MDP for a bounded functional of the BMC with a larger range of speed. As statistical applications, we provide superexponential convergence in probability and deviation inequalities (for either the Gaussian setting or the bounded setting), and the MDP for least square estimators of the parameters of a first-order bifurcating autoregressive process.