As artificial intelligence (AI) becomes increasingly integrated into workflows, humans must decide when to rely on AI advice. These decisions depend on general efficacy beliefs, i.e., humans' confidence in their own abilities and their perceptions of AI competence. While prior work has examined factors influencing AI reliance, the role of efficacy beliefs in shaping collaboration remains underexplored. Through a controlled experiment (N=240) where participants made repeated delegation decisions, we investigate how efficacy beliefs translate into instance-wise efficacy judgments under varying contextual information. Our explorative findings reveal efficacy beliefs as persistent cognitive anchors, leading to systematic "AI optimism". Contextual information operates asymmetrically: while AI performance information selectively eliminates the AI optimism bias, data or AI information amplify how efficacy discrepancies influence delegation decisions. Although efficacy discrepancies influence delegation behavior, they show weaker effects on human-AI team performance. As these findings challenge transparency-focused approaches, we propose design guidelines for effective collaborative settin
Oncology dose-finding trials are shifting from identifying the maximum tolerated dose (MTD) to determining the optimal biological dose (OBD), driven by the need for efficient methods that consider both toxicity and efficacy. This is particularly important for novel therapies, such as immunotherapies and molecularly targeted therapies, which often exhibit non-monotonic dose-efficacy curves. The Simple Toxicity and Efficacy Interval (STEIN) design has demonstrated strong performance in accommodating diverse dose-efficacy patterns and incorporating both toxicity and efficacy outcomes to select the OBD. However, the rapid accrual of patients and the often-delayed onset of toxicity and/or efficacy outcomes associated with these therapies make timely adaptive dose decisions challenging. To address these challenges, we propose TITE-STEIN, a model-assisted design that extends STEIN by incorporating time-to-event (TITE) outcomes for toxicity and/or efficacy. In this article, we demonstrate that TITE-STEIN significantly shortens trial duration
Complex multi-robot missions often require heterogeneous teams to jointly optimize task allocation, scheduling, and path planning to improve team performance under strict constraints. We formalize these complexities into a new class of problems, dubbed Spatio-Temporal Efficacy-optimized Allocation for Multi-robot systems (STEAM). STEAM builds upon trait-based frameworks that model robots using their capabilities (e.g., payload and speed), but goes beyond the typical binary success-failure model by explicitly modeling the efficacy of allocations as trait-efficacy maps. These maps encode how the aggregated capabilities assigned to a task determine performance. Further, STEAM accommodates spatio-temporal constraints, including a user-specified time budget (i.e., maximum makespan). To solve STEAM problems, we contribute a novel algorithm named Efficacy-optimized Incremental Task Allocation Graph Search (E-ITAGS) that simultaneously optimizes task performance and respects time budgets by interleaving task allocation, scheduling, and path planning. Motivated by the fact that trait-efficacy maps are difficult, if not impossible, to specify, E-ITAGS efficiently learns them using a realizab
Basket trials have gained increasing attention for their efficiency, as multiple patient subgroups are evaluated simultaneously. Conducted basket trials focus primarily on establishing the early efficacy of a treatment, yet continued monitoring of toxicity is essential. In this paper, we propose two Bayesian hierarchical models that enable bivariate analyses of toxicity and efficacy, while accounting for heterogeneity present in the treatment effects across patient subgroups. Specifically, one assumes the subgroup-specific toxicity and efficacy treatment effects, as a parameter vector, can be exchangeable or non-exchangeable; the other allows either the toxicity or efficacy parameters specific to the subgroups, to be exchangeable or non-exchangeable. The bivariate exchangeability and non-exchangeability distributions introduce a correlation parameter between treatment effects, while we stipulate a zero correlation when only toxicity or efficacy parameters are exchangeable. Simulation results show that our models perform robustly under different scenarios compared to the standard Bayesian hierarchical model and the stand-alone analyses, especially in producing higher power when the
Project Based Learning (PBL), recognized as an active learning strategy, has been linked to students' self-efficacy in prior studies, including those within Physics Education Research. Meanwhile, technological advancements have significantly facilitated the optimization of diverse learning modes, including online learning. However, questions such as how students perceive PBL and their self-efficacy, what forms of PBL design they prefer, and what benefits and challenges they encounter during its implementation remain underexplored in the existing literature. This phenomenological study sought to uncover the experiences of ten students through longitudinal observation in an online PBL Physics class, focusing on its influence on their self-efficacy. Data were collected via semi-structured, in-depth interviews. Social Cognitive Theory, Bandura's self-efficacy theory, and a theoretical framework for the impact of project-based learning on educational outcomes were employed as guiding frameworks to shape the interpretation of students' experiences and perspectives. Through Interpretative Phenomenological Analysis (IPA), the researchers id
Self-efficacy is a significant construct in education due to its predictive relationship with achievement. Existing measures of assessment-related self-efficacy concentrate on students' beliefs about content-specific tasks but omit beliefs around assessment-taking. This research aimed to develop and test the Measure of Assessment Self-Efficacy (MASE), designed to assess two types of efficacy beliefs related to assessment (i.e., 'comprehension and execution' and 'emotional regulation') in two scenarios (i.e., a low-stakes online quiz and a high-stakes final exam). Results from confirmatory factor analysis in Study 1 (N = 301) supported the hypothesised two-factor measurement models for both assessment scenarios. In Study 2, results from MGCFA (N = 277) confirmed these models were invariant over time and provided evidence for the scales' validity. Study 3 demonstrated the exam-related MASE was invariant across cohorts of students (Ns = 277; 329). Potential uses of the developed scales in educational research are discussed.
Analyzing and effectively communicating the efficacy and toxicity of treatment is the basis of risk benefit analysis (RBA). More efficient and objective tools are needed. We apply Chauhan Weighted Trajectory Analysis (CWTA) to perform RBA with superior objectivity, power, and clarity. We used CWTA to perform 1000-fold simulations of RCTs using ordinal endpoints for both treatment efficacy and toxicity. RCTs were simulated with 1:1 allocation at defined sample sizes and hazard ratios. We studied the simplest case of 3 levels each of toxicity and efficacy and the general case of the advanced cancer trial, with efficacy graded by five RECIST 1.1 health statuses and toxicity by the six-point CTCAE scale (6 x 5 matrix). The latter model was applied to a real-world dose escalation phase I trial in advanced cancer. Simulations in both the 3 x 3 and the 6 x 5 advanced cancer matrix confirmed that drugs with both superior efficacy and toxicity profiles synergize for greater statistical power with CWTA-RBA. The CWTA-RBA 6 x 5 matrix reduced sample size requirements over CWTA efficacy-only analysis. Application to the dose finding phase I clinical trial provided objective, statistically signi
Background: Despite similar education and background, programmers can exhibit vast differences in efficacy. While research has identified some potential factors, such as programming experience and domain knowledge, the effect of these factors on programmers' efficacy is not well understood. Aims: We aim to unravel the relationship between efficacy (speed and correctness) and measures of programming experience. We further investigate the correlates of programmer efficacy in terms of reading behavior and cognitive load. Method: For this purpose, we conducted a controlled experiment with 37 participants using electroencephalography (EEG) and eye tracking. We asked participants to comprehend up to 32 Java source-code snippets and observed their eye gaze and neural correlates of cognitive load. We analyzed the correlation of participants' efficacy with popular programming experience measures. Results: We found that programmers with high efficacy read source code in a more targeted way and with lower cognitive load. Commonly used experience levels do not predict programmer efficacy well, but self-estimation and indicators of learning eagerness are fairly accurate. Implications: The identified
In-context learners like TabPFN are promising for biomolecule efficacy prediction, where established molecular feature sets and relevant experimental results can serve as powerful contextual examples. However, their performance is highly sensitive to the provided context, making strategies like post-hoc ensembling of models trained on different data subsets a viable approach. An open question is how to select the best models for the ensemble without access to ground truth labels. In this study, we investigate an uncertainty-guided strategy for model selection. We demonstrate on an siRNA knockdown efficacy task that a TabPFN model using straightforward sequence-based features can surpass specialized state-of-the-art predictors. We also show that the model's predicted inter-quantile range (IQR), a measure of its uncertainty, has a negative correlation with true prediction error. We developed the OligoICP method, which selects and averages an ensemble of models with the lowest mean IQR for siRNA efficacy prediction, achieving superior performance compared to naive ensembling or using a single model trained on all available data. This finding highlights model uncertainty as a powerful,
Data is fundamental to the training of language models (LM). Recent research has been dedicated to data efficiency, which aims to maximize performance by selecting a minimal or optimal subset of training data. Techniques such as data filtering, sampling, and selection play a crucial role in this area. To complement it, we define Data Efficacy, which focuses on maximizing performance by optimizing the organization of training data and remains relatively underexplored. This work introduces a general paradigm, DELT, for considering data efficacy in LM training, which highlights the significance of training data organization. DELT comprises three components: Data Scoring, Data Selection, and Data Ordering. Among these components, we design Learnability-Quality Scoring (LQS), as a new instance of Data Scoring, which considers both the learnability and quality of each data sample from the gradient consistency perspective. We also devise Folding Ordering (FO), as a novel instance of Data Ordering, which addresses issues such as model forgetting and data distribution bias. Comprehensive experiments validate the data efficacy in LM training, which demonstrates the following: Firstly, variou
Significant improvements have been observed in the zero-shot capabilities of Large Language Models (LLMs). Due to their high sensitivity to input, research has increasingly focused on enhancing LLMs' performance via direct and simple prompt engineering rather than intricate domain adaptation. Studies suggest that LLMs exhibit emotional intelligence, and both positive and negative emotions can potentially enhance task performance. However, prior interaction prompts have predominantly concentrated on a single stimulus type, neglecting to compare different stimulus effects, examine the influence of varying task difficulties, or explore underlying mechanisms. This paper, inspired by the positive correlation between self-efficacy and task performance within social cognitive theory, introduces Verbal Efficacy Stimulations (VES). Our VES comprises three types of verbal prompts: encouraging, provocative, and critical, addressing six aspects such as helpfulness and competence. We further categorize task difficulty, aiming to extensively investigate how distinct VES influence the self-efficacy and task achievements of language models at varied levels of difficulty. The experimen
Successful pharmaceutical drug development requires finding correct doses that provide an optimum balance between efficacy and toxicity. Competing responses to dose such as efficacy and toxicity often will increase with dose, and it is important to identify a range of doses to provide an acceptable efficacy response (minimum effective dose) while not causing unacceptable intolerance or toxicity (maximum tolerated dose). How this should be done is not self-evident. Relating efficacy to dose conditionally on possible toxicity may be problematic because whether toxicity occurs will not be known when a dose for a patient needs to be chosen. Copula models provide an appealing approach for incorporating an efficacy-toxicity association when the functional forms of the efficacy and toxicity dose-response models are known but may be less appealing in practice when the functional forms of the dose-response models and the particular copula association model are unknown. This paper explores the use of the BMA-Mod Bayesian model averaging framework that accommodates efficacy and toxicity responses to provide a statistically valid, distributionally flexible, and operationally practical model-ag
Traditional measures of vaccine efficacy (VE) are inherently asymmetric, constrained above by $1$ but unbounded below. As a result, VE estimates and corresponding confidence intervals can extend far below zero, making interpretation difficult and potentially obscuring whether the apparent effect reflects true harm or simply statistical uncertainty. The proposed symmetric vaccine efficacy (SVE) is a bounded and interpretable alternative to VE that maintains desirable statistical properties while resolving these asymmetries. SVE is defined as a symmetric transformation of infection risks, with possible values within $[-1, 1]$, providing a common scale for both beneficial and harmful vaccine effects. This paper describes the relationship between SVE and traditional VE, considers inference about SVE, and illustrates the utility of the proposed measure by reanalyzing data from a randomized trial of a candidate HIV vaccine. Open-source tools for computing estimates of SVE and corresponding confidence intervals are available in R through the sve package.
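To make the asymmetry concrete, the following minimal sketch contrasts traditional VE, computed as one minus the risk ratio, with one symmetric transformation that has the properties described above. The abstract does not state the SVE formula, so the form `(p0 - p1) / max(p0, p1)` and the function names are assumptions for illustration only; they are not the interface of the sve package.

```python
def vaccine_efficacy(risk_vaccinated: float, risk_unvaccinated: float) -> float:
    """Traditional VE = 1 - p1/p0: bounded above by 1, unbounded below."""
    if risk_unvaccinated <= 0:
        raise ValueError("unvaccinated risk must be positive")
    return 1.0 - risk_vaccinated / risk_unvaccinated


def symmetric_ve_sketch(risk_vaccinated: float, risk_unvaccinated: float) -> float:
    """ASSUMED symmetric form (p0 - p1) / max(p0, p1), which lies in [-1, 1].

    This is one transformation consistent with the abstract's description,
    not necessarily the paper's definition.
    """
    largest_risk = max(risk_vaccinated, risk_unvaccinated)
    if largest_risk == 0:
        return 0.0  # no infections in either arm: no measurable effect
    return (risk_unvaccinated - risk_vaccinated) / largest_risk


if __name__ == "__main__":
    # Swapping the two arms flips the sign of the symmetric measure,
    # whereas traditional VE moves from 0.8 to -4.
    for p1, p0 in [(0.01, 0.05), (0.05, 0.01)]:
        print(p1, p0, vaccine_efficacy(p1, p0), symmetric_ve_sketch(p1, p0))
```

Under this assumed form, a beneficial effect and a harmful effect of the same relative size map to values of equal magnitude and opposite sign.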
Suppose one has data from one or more completed vaccine efficacy trials and wishes to estimate the efficacy in a new setting. Often logistical or ethical considerations make running another efficacy trial impossible. Fortunately, if there is a biomarker that is the primary modifier of efficacy, then the biomarker-conditional efficacy may be identical in the completed trials and the new setting, or at least informative enough to meaningfully bound this quantity. Given a sample of this biomarker from the new population, we might hope we can bridge the results of the completed trials to estimate the vaccine efficacy in this new population. Unfortunately, even knowing the true conditional efficacy in the new population fails to identify the marginal efficacy due to the unknown conditional unvaccinated risk. We define a curve that partially identifies (lower bounds) the marginal efficacy in the new population as a function of the population's marginal unvaccinated risk, under the assumption that one can identify bounds on the conditional unvaccinated risk in the new population. Interpreting the curve only requires identifying plausible regions of the marginal unvaccinated risk in the ne
Stress impacts driving-related cognitive functions like attention and decision-making, and may arise in automated vehicles due to non-driving tasks. Unobtrusive relaxation techniques are needed to regulate stress without distracting from driving. Tactile wearables have shown efficacy in stress regulation through respiratory guidance, but individual variations may affect their efficacy. This study assessed slow-breathing tactile guidance under different stress levels on 85 participants. Physiological, behavioral and subjective data were collected. The influence of individual variations (e.g., driving habits and behavior, personality) was explored using logistic regression analysis. Participants could follow the guidance and adjust breathing while driving, but subjective efficacy depended on individual variations linked to different efficiency in using the technique, in relation to its attentional cost. An influence of factors linked to the evaluation of context criticality was also found. The results suggest that considering individual and contextual variations is crucial in designing and using such techniques in demanding driving contexts. In this line some design recommendations
Collective efficacy -- the capacity of communities to exert social control toward the realization of their shared goals -- is a foundational concept in the urban sociology and neighborhood effects literature. Traditionally, empirical studies of collective efficacy use large sample surveys to estimate collective efficacy of different neighborhoods within an urban setting. Such studies have demonstrated an association between collective efficacy and local variation in community violence, educational achievement, and health. Unlike traditional collective efficacy measurement strategies, the Adolescent Health and Development in Context (AHDC) Study implemented a new approach, obtaining spatially-referenced, place-based ratings of collective efficacy from a representative sample of individuals residing in Columbus, OH. In this paper, we introduce a novel nonstationary spatial model for interpolation of the AHDC collective efficacy ratings across the study area which leverages administrative data on land use. Our constructive model specification strategy involves dimension expansion of a latent spatial process and the use of a filter defined by the land-use partition of the study region
A fundamental mistake in receptor theory has led to an enduring misunderstanding of how to estimate the affinity and efficacy of an agonist. These properties are inextricably linked and cannot be easily separated in any case where the binding of a ligand induces a conformation change in its receptor. Consequently, binding curves and concentration-response relationships for receptor agonists have no straightforward interpretation. This problem, the affinity-efficacy problem, remains overlooked and misunderstood despite it being recognised in 1987. To avoid the further propagation of this misunderstanding, we propose that the affinity-efficacy problem should be included in the core curricula for pharmacology undergraduates proposed by the British Pharmacological Society and IUPHAR.
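One common way to see why the two properties are entangled is the two-state scheme in which an agonist $A$ binds a receptor $R$ and the bound receptor then changes conformation to an active state $AR^*$. The abstract does not spell out a mechanism, so the scheme and the symbols $K_A$ (microscopic dissociation constant of the binding step) and $E$ (equilibrium constant of the conformation change) are assumptions introduced here for illustration.

```latex
% Assumed two-state scheme: binding followed by a conformation change.
\[
  A + R \;\rightleftharpoons\; AR \;\rightleftharpoons\; AR^{*},
  \qquad
  K_A = \frac{[A][R]}{[AR]},
  \qquad
  E = \frac{[AR^{*}]}{[AR]} .
\]
% Fraction of receptors with ligand bound (AR plus AR*), as seen in a binding experiment:
\[
  p_{\text{bound}} = \frac{[A]}{[A] + K_{\text{eff}}},
  \qquad
  K_{\text{eff}} = \frac{K_A}{1 + E} .
\]
% Fraction of receptors in the active state, as seen in a response experiment:
\[
  p_{\text{active}} = \frac{E}{1 + E}\cdot\frac{[A]}{[A] + K_{\text{eff}}} .
\]
```

In this sketch the midpoint of both the binding curve and the response curve is $K_{\text{eff}}$, which depends on $E$ as well as $K_A$, so neither curve on its own yields the affinity of the binding step.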
Benchmarking drug efficacy is a critical step in clinical trial design and planning. The challenge is that much of the data on efficacy endpoints is stored in scientific papers in free text form, so extraction of such data is currently a largely manual task. Our objective is to automate this task as much as possible. In this study we have developed and optimised a framework to extract efficacy endpoints from text in scientific papers, using a machine learning approach. Our machine learning model predicts 25 classes associated with efficacy endpoints and leads to high F1 scores (harmonic mean of precision and recall) of 96.4% on the test set, and 93.9% and 93.7% on two case studies. These methods were evaluated against - and showed strong agreement with - subject matter experts and show significant promise in the future of automating the extraction of clinical endpoints from free text. Clinical information extraction from text data is currently a laborious manual task which scales poorly and is prone to human error. Demonstrating the ability to extract efficacy endpoints automatically shows great promise for accelerating clinical trial design moving forwards.
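For readers less familiar with the metric, the F1 score mentioned above can be computed as in the short sketch below. The function names and the example counts are illustrative assumptions and are not taken from the study.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_from_counts(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F1 for one class from its confusion counts (hypothetical helper)."""
    predicted = true_pos + false_pos  # everything the model labeled as this class
    actual = true_pos + false_neg     # everything that truly belongs to this class
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return f1_score(precision, recall)


if __name__ == "__main__":
    # Made-up counts for a single class, for illustration only.
    print(f1_from_counts(true_pos=8, false_pos=2, false_neg=2))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a high F1 requires both precision and recall to be high. How per-class scores are averaged across the 25 classes is not stated in the abstract.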
We develop semiparametric methods for estimating subgroup-specific relative vaccine efficacy against multiple viral strains in a partially vaccinated population. Focusing on observational case-only studies, we address informative missingness in strain type due to vaccination status, pre-vaccination characteristics, and post-infection factors such as viral load. We establish general conditions for the nonparametric identification of relative conditional vaccine efficacy between strains using covariate-adjusted conditional odds ratio parameters. Assuming a log-linear parametric form for strain-specific conditional vaccine efficacy, we propose targeted maximum likelihood estimators based on partially linear logistic regression, leveraging machine learning for flexible confounding adjustment. Finally, we apply our methods to estimate relative strain-specific conditional vaccine efficacy in the ENSEMBLE COVID-19 vaccine trial.
Recently, numerous pharmaceutical sponsors have expressed a great deal of interest in the development of biosimilars, which requires clinical trials to demonstrate the equivalence of pharmacokinetics (PK) and clinical efficacy. Pharmacodynamics (PD) may be used in evaluating efficacy if there are relevant PD markers available. However, in their absence, it is necessary to design the associated clinical trials to include efficacy measures as the primary endpoint. In this study, we propose an adaptive seamless PK and efficacy design with a framework to mitigate the risk of misspecifying both PK and efficacy parameters. Here, we consider the clinical development of biosimilars, including their evaluation in patients rather than healthy volunteers, in a situation where both PK and efficacy parameters are required to demonstrate equivalence. To reduce the risk of failing to confirm equivalence, the proposed method considers incorporating a new PK trial for PK equivalence within the PK portion, which is the early stage of the efficacy part, and re-calculating the sample size for efficacy equivalence. This proposal provides appealing advantages such