Expensive optimization tasks are ubiquitous in real-world applications, demanding highly specialized solvers. While LLM-driven automated solver generation shows promise, current paradigms face three critical issues when tackling expensive optimization: factual hallucinations due to deficient domain knowledge, the frequent dismantling of previously established locally optimal structures during refinement, and the prohibitive evaluation costs alongside restricted generalization caused by executing on training instances. To address these issues, we introduce AutoSG, a fully automated workflow directly translating natural language prompts into executable customized solvers. AutoSG features three core innovations: a retrieval-augmented solver generation module strictly grounding code in verified literature; a one-step self-refinement operator introducing task-specific improvements while preserving critical structural components; and an instance-free Elo-based LLM-as-a-Judge evaluation mechanism rapidly establishing global rankings. Extensive evaluations across diverse expensive optimization tasks confirm AutoSG significantly outperforms human-designed state-of-the-art frameworks and exi
Cost-aware routing dynamically dispatches user queries to models of varying capability to balance performance and inference cost. However, the routing strategy introduces a new security concern that adversaries may manipulate the router to consistently select expensive high-capability models. Existing routing attacks depend on either white-box access or heuristic prompts, rendering them ineffective in real-world black-box scenarios. In this work, we propose R$^2$A, which aims to mislead black-box LLM routers to expensive models via adversarial suffix optimization. Specifically, R$^2$A deploys a hybrid ensemble surrogate router to mimic the black-box router. A suffix optimization algorithm is further adapted for the ensemble-based surrogate. Extensive experiments on multiple open-source and commercial routing systems demonstrate that {R$^2$A} significantly increases the routing rate to expensive models on queries of different distributions. Code and examples: https://github.com/thcxiker/R2A-Attack.
Species detection is important for monitoring the health of ecosystems and identifying invasive species, serving a crucial role in guiding conservation efforts. Multimodal neural networks have seen increasing use for identifying species to help automate this task, but they have two major drawbacks. First, their black-box nature prevents the interpretability of their decision making process. Second, collecting genetic data is often expensive and requires invasive procedures, often necessitating researchers to capture or kill the target specimen. We address both of these problems by extending prototype networks (ProtoPNets), which are a popular and interpretable alternative to traditional neural networks, to the multimodal, cost-aware setting. We ensemble prototypes from each modality, using an associated weight to determine how much a given prediction relies on each modality. We further introduce methods to identify cases for which we do not need the expensive genetic information to make confident predictions. We demonstrate that our approach can intelligently allocate expensive genetic data for fine-grained distinctions while using abundant image data for clearer visual classificat
Expensive optimization problems (EOPs) are black-box tasks with costly objective evaluations and no gradient access, making the evaluation budget the key bottleneck. Surrogate-assisted evolutionary algorithms (SAEAs) reduce evaluations via surrogate predictions, but conventional surrogates often require frequent retraining as populations evolve, incurring overhead. This paper proposes R2SAEA, a reinforcement-trained relation-based large language model (LLM) surrogate assisted evolutionary algorithm. We cast relation-based surrogate modeling as an in-context pairwise reasoning task. To enable efficient inference in evolutionary loops, we develop an anchor-based iterative context construction strategy that reduces prompt complexity from quadratic to linear in population size, and a voting-based aggregation scheme that converts predicted relations into scores for offspring selection. We further build an RL pipeline from evolutionary trajectories and fine-tune Qwen2.5 with GRPO. Experiments on single- and multi-objective benchmarks show improved relation prediction and state-of-the-art optimization performance over strong SAEA baselines and general LLMs. Quantization also enables effic
To fully leverage the multi-task optimization paradigm for accelerating the solution of expensive scheduling problems, this study has effectively tackled three vital concerns. The primary issue is identifying auxiliary tasks that closely resemble the original expensive task. We suggested a sampling strategy based on job importance, creating a compact matrix by extracting crucial rows from the entire problem specification matrix of the expensive task. This matrix serves as an economical auxiliary task. Mathematically, we proved that this economical auxiliary task bears similarity to its corresponding expensive task. The subsequent concern revolves around making auxiliary tasks more cost-effective. We determined the sampling proportions for the entire problem specification matrix through factorial design experiments, resulting in a more compact auxiliary task. With a reduced search space and shorter function evaluation time, it can rapidly furnish high-quality transferable information for the primary task. The last aspect involves designing transferable deep information from auxiliary tasks. We regarded the job priorities in the (sub-) optimal solutions to the economical auxiliary ta
Bayesian inference typically relies on a large number of model evaluations to estimate posterior distributions. Established methods like Markov Chain Monte Carlo (MCMC) and Amortized Bayesian Inference (ABI) can become computationally challenging. While ABI enables fast inference after training, generating sufficient training data still requires thousands of model simulations, which is infeasible for expensive models. Surrogate models offer a solution by providing approximate simulations at a lower computational cost, allowing the generation of large data sets for training. However, the introduced approximation errors and uncertainties can lead to overconfident posterior estimates. To address this, we propose Uncertainty-Aware Surrogate-based Amortized Bayesian Inference (UA-SABI) -- a framework that combines surrogate modeling and ABI while explicitly quantifying and propagating surrogate uncertainties through the inference pipeline. Our experiments show that this approach enables reliable, fast, and repeated Bayesian inference for computationally expensive models, even under tight time constraints.
Across many domains of science, stochastic models are an essential tool to understand the mechanisms underlying empirically observed data. Models can be of different levels of detail and accuracy, with models of high-fidelity (i.e., high accuracy) to the phenomena under study being often preferable. However, inferring parameters of high-fidelity models via simulation-based inference is challenging, especially when the simulator is computationally expensive. We introduce a multifidelity approach to neural posterior estimation that uses transfer learning to leverage inexpensive low-fidelity simulations to efficiently infer parameters of high-fidelity simulators. Our method applies the multifidelity scheme to both amortized and non-amortized neural posterior estimation. We further improve simulation efficiency by introducing a sequential variant that uses an acquisition function targeting the predictive uncertainty of the density estimator to adaptively select high-fidelity parameters. On established benchmark and neuroscience tasks, our approaches require up to two orders of magnitude fewer high-fidelity simulations than current methods, while showing comparable performance. Overall,
Heuristic search solvers like RTDP-Bel and LAO* have proven effective for computing optimal and bounded sub-optimal solutions for Partially Observable Markov Decision Processes (POMDPs), which are typically formulated as belief MDPs. A belief represents a probability distribution over possible system states. Given a parent belief and an action, computing belief state transitions involves Bayesian updates that combine the transition and observation models of the POMDP to determine successor beliefs and their transition probabilities. However, there is a class of problems, specifically in robotics, where computing these transitions can be prohibitively expensive due to costly physics simulations, raycasting, or expensive collision checks required by the underlying transition and observation models, leading to long planning times. To address this challenge, we propose Lazy RTDP-Bel and Lazy LAO*, which defer computing expensive belief state transitions by leveraging Q-value estimation, significantly reducing planning time. We demonstrate the superior performance of the proposed lazy planners in domains such as contact-rich manipulation for pose estimation, outdoor navigation in rough
Training a state-of-the-art Large Language Model (LLM) is an increasingly expensive endeavor due to growing computational, hardware, energy, and engineering demands. Yet, an often-overlooked (and seldom paid) expense is the human labor behind these models' training data. Every LLM is built on an unfathomable amount of human effort: trillions of carefully written words sourced from books, academic papers, codebases, social media, and more. This position paper aims to assign a monetary value to this labor and argues that the most expensive part of producing an LLM should be the compensation provided to training data producers for their work. To support this position, we study 64 LLMs released between 2016 and 2024, estimating what it would cost to pay people to produce their training datasets from scratch. Even under highly conservative estimates of wage rates, the costs of these models' training datasets are 10-1000 times larger than the costs to train the models themselves, representing a significant financial liability for LLM providers. In the face of the massive gap between the value of training data and the lack of compensation for its creation, we highlight and discuss researc
Expensive multi-objective optimization is a prevalent and crucial concern in many real-world scenarios, where sample-efficiency is vital due to the limited evaluations to recover the true Pareto front for decision making. Existing works either involve rebuilding Gaussian process surrogates from scratch for each objective in each new problem encountered, or rely on extensive past domain experiments for pre-training deep learning models, making them hard to generalize and impractical to cope with various emerging applications in the real world. To address this issue, we propose a new paradigm named FoMEMO (Foundation Models for Expensive Multi-objective Optimization), which enables the establishment of a foundation model conditioned on any domain trajectory and user preference, and facilitates fast in-context optimization based on the predicted preference-wise aggregated posteriors. Rather than accessing extensive real-world domain experiments for training, we demonstrate that pre-training the foundation model with a diverse set of hundreds of millions of synthetic data can lead to superior generalization and optimization performance to unknown problems, without necessitating any sub
In expensive multi-objective optimization, where the evaluation budget is strictly limited, selecting promising candidate solutions for expensive fitness evaluations is critical for accelerating convergence and improving algorithmic performance. However, designing an optimization strategy that effectively balances convergence, diversity, and distribution remains a challenge. To tackle this issue, we propose a composite indicator-based evolutionary algorithm (CI-EMO) for expensive multi-objective optimization. In each generation of the optimization process, CI-EMO first employs NSGA-III to explore the solution space based on fitness values predicted by surrogate models, generating a candidate population. Subsequently, we design a novel composite performance indicator to guide the selection of candidates for real fitness evaluation. This indicator simultaneously considers convergence, diversity, and distribution to improve the efficiency of identifying promising candidate solutions, which significantly improves algorithm performance. The composite indicator-based candidate selection strategy is easy to achieve and computes efficiency. Component analysis experiments confirm the effect
Parametric multi-objective optimization (PMO) addresses the challenge of solving an infinite family of multi-objective optimization problems, where optimal solutions must adapt to varying parameters. Traditional methods require re-execution for each parameter configuration, leading to prohibitive costs when objective evaluations are computationally expensive. To address this issue, we propose Parametric Pareto Set Learning with multi-objective Bayesian Optimization (PPSL-MOBO), a novel framework that learns a unified mapping from both preferences and parameters to Pareto-optimal solutions. PPSL-MOBO leverages a hypernetwork with Low-Rank Adaptation (LoRA) to efficiently capture parametric variations, while integrating Gaussian process surrogates and hypervolume-based acquisition to minimize expensive function evaluations. We demonstrate PPSL-MOBO's effectiveness on two challenging applications: multi-objective optimization with shared components, where certain design variables must be identical across solution families due to modular constraints, and dynamic multi-objective optimization, where objectives evolve over time. Unlike existing methods that cannot directly solve PMO probl
Multi-objective Bayesian optimization (MOBO) has shown promising performance on various expensive multi-objective optimization problems (EMOPs). However, effectively modeling complex distributions of the Pareto optimal solutions is difficult with limited function evaluations. Existing Pareto set learning algorithms may exhibit considerable instability in such expensive scenarios, leading to significant deviations between the obtained solution set and the Pareto set (PS). In this paper, we propose a novel Composite Diffusion Model based Pareto Set Learning algorithm, namely CDM-PSL, for expensive MOBO. CDM-PSL includes both unconditional and conditional diffusion model for generating high-quality samples. Besides, we introduce an information entropy based weighting method to balance different objectives of EMOPs. This method is integrated with the guiding strategy, ensuring that all the objectives are appropriately balanced and given due consideration during the optimization process; Extensive experimental results on both synthetic benchmarks and real-world problems demonstrates that our proposed algorithm attains superior performance compared with various state-of-the-art MOBO algo
In its initial form, the expensive son hypothesis postulates that sons from male-biased sexually dimorphic species require more food during growth than daughters, which ultimately incur fitness costs for mothers predominantly producing and rearing sons. We first dissect the evolutionary framework in which the expensive son hypothesis is rooted, and we provide a critical reappraisal of its differences from other evolutionary theories proposed in the field of sex allocation. Then, we synthesize the current (and absence of) support for the costs of producing and rearing sons on maternal fitness components (future reproduction and survival). Regarding the consequences in terms of future reproduction, we highlight that species with pronounced sexual size dimorphism display a higher cost of sons than of daughters on subsequent reproductive performance, at least in mammals. However, in most studies, the relative fitness costs of producing and rearing sons and daughters can be due to sex-biased maternal allocation strategies rather than differences in energetic demands of offspring, which constitutes an alternative mechanism to the expensive son hypothesis stricto sensu. We observe that em
Expensive optimization problems (EOPs) have attracted increasing research attention over the decades due to their ubiquity in a variety of practical applications. Despite many sophisticated surrogate-assisted evolutionary algorithms (SAEAs) that have been developed for solving such problems, most of them lack the ability to transfer knowledge from previously-solved tasks and always start their search from scratch, making them troubled by the notorious cold-start issue. A few preliminary studies that integrate transfer learning into SAEAs still face some issues, such as defective similarity quantification that is prone to underestimate promising knowledge, surrogate-dependency that makes the transfer methods not coherent with the state-of-the-art in SAEAs, etc. In light of the above, a plug and play competitive knowledge transfer method is proposed to boost various SAEAs in this paper. Specifically, both the optimized solutions from the source tasks and the promising solutions acquired by the target surrogate are treated as task-solving knowledge, enabling them to compete with each other to elect the winner for expensive evaluation, thus boosting the search speed on the target task.
Optimization algorithms are very different from human optimizers. A human being would gain more experiences through problem-solving, which helps her/him in solving a new unseen problem. Yet an optimization algorithm never gains any experiences by solving more problems. In recent years, efforts have been made towards endowing optimization algorithms with some abilities of experience learning, which is regarded as experience-based optimization. In this paper, we argue that hard optimization problems could be tackled efficiently by making better use of experiences gained in related problems. We demonstrate our ideas in the context of expensive optimization, where we aim to find a near-optimal solution to an expensive optimization problem with as few fitness evaluations as possible. To achieve this, we propose an experience-based surrogate-assisted evolutionary algorithm (SAEA) framework to enhance the optimization efficiency of expensive problems, where experiences are gained across related expensive tasks via a novel meta-learning method. These experiences serve as the task-independent parameters of a deep kernel learning surrogate, then the solutions sampled from the target task are
Dynamic environments pose great challenges for expensive optimization problems, as the objective functions of these problems change over time and thus require remarkable computational resources to track the optimal solutions. Although data-driven evolutionary optimization and Bayesian optimization (BO) approaches have shown promise in solving expensive optimization problems in static environments, the attempts to develop such approaches in dynamic environments remain rarely unexplored. In this paper, we propose a simple yet effective meta-learning-based optimization framework for solving expensive dynamic optimization problems. This framework is flexible, allowing any off-the-shelf continuously differentiable surrogate model to be used in a plug-in manner, either in data-driven evolutionary optimization or BO approaches. In particular, the framework consists of two unique components: 1) the meta-learning component, in which a gradient-based meta-learning approach is adopted to learn experience (effective model parameters) across different dynamics along the optimization process. 2) the adaptation component, where the learned experience (model parameters) is used as the initial para
Pareto Set Learning (PSL) is a promising approach for approximating the entire Pareto front in multi-objective optimization (MOO) problems. However, existing derivative-free PSL methods are often unstable and inefficient, especially for expensive black-box MOO problems where objective function evaluations are costly. In this work, we propose to address the instability and inefficiency of existing PSL methods with a novel controllable PSL method, called Co-PSL. Particularly, Co-PSL consists of two stages: (1) warm-starting Bayesian optimization to obtain quality Gaussian Processes priors and (2) controllable Pareto set learning to accurately acquire a parametric mapping from preferences to the corresponding Pareto solutions. The former is to help stabilize the PSL process and reduce the number of expensive function evaluations. The latter is to support real-time trade-off control between conflicting objectives. Performances across synthesis and real-world MOO problems showcase the effectiveness of our Co-PSL for expensive multi-objective optimization tasks.
Bayesian optimization (BO) and its batch extensions are successful for optimizing expensive black-box functions. However, these traditional BO approaches are not yet ideal for optimizing less expensive functions when the computational cost of BO can dominate the cost of evaluating the blackbox function. Examples of these less expensive functions are cheap machine learning models, inexpensive physical experiment through simulators, and acquisition function optimization in Bayesian optimization. In this paper, we consider a batch BO setting for situations where function evaluations are less expensive. Our model is based on a new exploration strategy using geometric distance that provides an alternative way for exploration, selecting a point far from the observed locations. Using that intuition, we propose to use Sobol sequence to guide exploration that will get rid of running multiple global optimization steps as used in previous works. Based on the proposed distance exploration, we present an efficient batch BO approach. We demonstrate that our approach outperforms other baselines and global optimization methods when the function evaluations are less expensive.
Maximization of an expensive, unimodal function under random observations has been an important problem in hyperparameter tuning. It features expensive function evaluations (which means small budgets) and a high level of noise. We develop an algorithm based on entropy reduction of a probabilistic belief about the optimum. The algorithm provides an efficient way of estimating the computationally intractable surrogate objective in the general Entropy Search algorithm by leveraging a sampled belief model and designing a metric that measures the information value of any search point.