Found 20 results
Standard machine learning pipelines often admit many near-optimal models. These "Rashomon sets" pose a range of challenges and opportunities for uncertainty-aware, robust decision making. They allow users to incorporate domain knowledge and preferences that would otherwise be difficult to specify directly in an objective, and they quantify diversity among valid models for a given training dataset and objective function. However, computation of Rashomon sets, even for simple, interpretable model classes such as sparse decision trees, continues to require immense memory and runtime resources. We present PRAXIS, an algorithm to approximate this Rashomon set with orders of magnitude improvement in runtime and memory usage. We validate that PRAXIS regularly recovers almost all of the full Rashomon set. PRAXIS allows researchers and practitioners to scalably model the Rashomon set for real-world datasets. Code for PRAXIS is available at https://github.com/zakk-h/PRAXIS
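To make the idea of a Rashomon set concrete, here is a minimal sketch that enumerates every decision stump on a toy dataset and keeps those whose training loss is within `epsilon` of the best. This is brute-force enumeration for illustration only, not the PRAXIS algorithm; the function name `rashomon_set`, the stump model class, the plain misclassification loss, and the toy data are all assumptions made for this example.

```python
def rashomon_set(X, y, epsilon):
    """Return every decision stump whose loss is within epsilon of the best.

    Each stump is (feature, threshold, flip): it predicts 1 when
    row[feature] > threshold, and the prediction is inverted when flip is True.
    Loss is the plain misclassification rate (an assumption for this sketch).
    """
    candidates = []
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            for flip in (False, True):
                errors = 0
                for row, label in zip(X, y):
                    pred = 1 if row[feature] > threshold else 0
                    if flip:
                        pred = 1 - pred
                    errors += pred != label
                candidates.append(((feature, threshold, flip), errors / len(X)))
    best = min(loss for _, loss in candidates)
    return [(model, loss) for model, loss in candidates if loss <= best + epsilon]


# Hypothetical toy data: the label equals the first feature.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 1, 1]

exact = rashomon_set(X, y, epsilon=0.0)   # only the optimal stump
loose = rashomon_set(X, y, epsilon=0.5)   # every stump with loss <= 0.5
```

Even on four rows, widening `epsilon` admits several distinct models. Exhaustive enumeration like this grows quickly with the size of the model class, which is the cost the abstract says an approximation is meant to avoid.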
Large language models are moving scientific research from text assistance toward agentic workflows, yet biological research requires strong object validation, methodological suitability, reproducibility, and auditability. Prompt engineering, general RAG, or tool use alone cannot reliably produce domain-specific scientific judgment. Here, we present PRAXIS, a verifiable biological research agent framework driven by literature learning and case distillation. PRAXIS converts research experience, failure boundaries, domain rules, and executable procedures into structured long-term memory. By coordinating successful cases, negative cases, rules, and skills, PRAXIS supports problem definition, object validation, method selection, workflow execution, result interpretation, and review feedback across diverse biocomputational tasks. We instantiated PRAXIS as an agent suite for biomedical computing and evaluated it through object validation, case retrieval, memory ablation, public benchmarks, and cross-agent workflows. The results show that case-based learning improves method selection, error suppression, and workflow organization in complex biological research tasks. Rather than replacing s
Unresolved production cloud incidents cost an average of over $2M per hour. This paper introduces PRAXIS, an orchestrator that manages and deploys an agentic workflow for diagnosing code- and configuration-caused cloud incidents. PRAXIS employs an LLM-driven structured traversal over two types of graph: (1) a service dependency graph (SDG) that captures microservice-level dependencies; and (2) a hammock-block program dependence graph (PDG) that captures code-level dependencies for each microservice. Compared to state-of-the-art ReAct baselines, PRAXIS improves RCA accuracy by up to 6.3x while reducing token consumption by 5.3x. PRAXIS is demonstrated on a set of 30 comprehensive real-world incidents that is being compiled into an RCA benchmark.
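The service-level half of the traversal described above can be sketched as a walk over a service dependency graph. In this sketch a fixed set of unhealthy services stands in for the LLM-driven judgment, and the code-level PDG step is omitted; the function name `root_cause_candidates`, the graph encoding, and the service names are hypothetical and not taken from the paper.

```python
def root_cause_candidates(sdg, unhealthy, entry):
    """Walk dependency edges from the alerting service.

    sdg maps each service to the services it depends on. Only unhealthy
    services are followed. An unhealthy service with no unhealthy
    dependencies is reported as a root-cause candidate.
    """
    seen, stack, roots = set(), [entry], []
    while stack:
        service = stack.pop()
        if service in seen or service not in unhealthy:
            continue
        seen.add(service)
        bad_deps = [dep for dep in sdg.get(service, []) if dep in unhealthy]
        if bad_deps:
            stack.extend(bad_deps)   # the fault lies further downstream
        else:
            roots.append(service)    # nothing below it is failing
    return roots


# Hypothetical microservice topology and health snapshot.
sdg = {"frontend": ["cart", "auth"], "cart": ["db"], "auth": [], "db": []}
unhealthy = {"frontend", "cart", "db"}

candidates = root_cause_candidates(sdg, unhealthy, entry="frontend")
```

Pruning healthy branches early means only the services on a failing path are inspected. That is one plausible reason a structured traversal might consume fewer tokens than free-form exploration, although the abstract does not spell out the mechanism.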
Vision Language Models exhibit impressive performance for various tasks, yet they often lack the sophisticated situational reasoning required for complex decision-making. This paper shows that VLMs can achieve surprisingly strong decision-making performance when visual scenes are replaced by textual descriptions, suggesting foundational reasoning can be effectively learned from language. Motivated by this insight, we propose Praxis-VLM, a reasoning VLM for vision-grounded decision-making. Praxis-VLM employs the GRPO algorithm on textual scenarios to instill robust reasoning capabilities, where models learn to evaluate actions and their consequences. These reasoning skills, acquired purely from text, successfully transfer to multimodal inference with visual inputs, significantly reducing reliance on scarce paired image-text training data. Experiments across diverse decision-making benchmarks demonstrate that Praxis-VLM substantially outperforms standard supervised fine-tuning, exhibiting superior performance and generalizability. Further analysis confirms that our models engage in explicit and effective reasoning, underpinning their enhanced performance and adaptability.
In this paper, we investigate hand gesture classifiers that rely on the abstracted 'skeletal' data recorded using an RGB-Depth sensor. We focus on 'skeletal' data represented by body joint coordinates from the PRAXIS dataset. The PRAXIS dataset contains recordings of patients with cortical pathologies such as Alzheimer's disease, performing a Praxis test under the direction of a clinician. We propose hand gesture classifiers that are more effective on the PRAXIS dataset than previously proposed models. Body joint data offers a compressed form of data that can be analyzed specifically for hand gesture recognition. Using a combination of windowing techniques with a deep learning architecture such as a Recurrent Neural Network (RNN), we achieved an overall accuracy of 70.8% using only body joint data. In addition, we investigated a long short-term memory (LSTM) network to extract and analyze the movement of the joints through time to recognize the hand gestures being performed and achieved gesture recognition rates of 74.3% and 67.3% for static and dynamic gestures, respectively. The proposed approach contributed to the task of developing an automated, accurate, and in
What could designing for carbon reduction of heating and cooling in commercial settings look like in the near future? How can we challenge dominant mindsets and paradigms of efficiency and behaviour change? How can we help build worlds through our practice that can become future realities? This paper introduces the fictional consultancy ANCSTRL.LAB to explore opportunities for making space in research projects that can encourage more systems-oriented interventions. We present a design fiction that asks 'what if energy management and reduction practice embraced systems thinking?' Our design fiction explores how future energy consultancies could utilise systems thinking and (more than) human-centred design to re-imagine energy management practice and change systems in ways that are currently unfathomable. We finish by discussing how LIMITS research can utilise design fiction and speculative praxis to help build new material realities where more holistic perspectives, the leveraging of systems change, and the imagining of post-neoliberal futures are the norm.
Information power is the capacity to convert data flows into durable shifts in attention, belief, and behavior. We argue that this power has migrated from broadcast persuasion to platform-ized, data-driven operations that fuse computational delivery with cognitive effects. In this context, we define and bound information power within international relations and the information environment while demonstrating why observing and measuring it demands an integrated lens that combines politics (goals and governance), computing (data movement and algorithmic delivery), and psychology (attention, affect, memory, and belief). The article contributes three elements: (1) a triadic analytical framework that specifies the minimum variables and instrumentation needed for study; (2) two crosswalks that map common objectives (persuade, disrupt, shape) and target classes (leaders, elites, publics) to political, computational, and psychological tactics, yielding practical coding heuristics and testable hypotheses; and (3) a McCumber-style cube for information influence that integrates targets, operations, as well as machines (automation and AI) into a single space. The space provides for comparative
Explainable AI (XAI) is often promoted with the idea of helping users understand how machine learning models function and produce predictions. Still, most of these benefits are reserved for those with specialized domain knowledge, such as machine learning developers. Recent research has argued that making AI explainable can be a viable way of making AI more useful in real-world contexts, especially within low-resource domains in the Global South. While AI has transcended borders, a limited amount of work focuses on democratizing the concept of explainable AI to the "majority world", leaving much room to explore and develop new approaches within this space that cater to the distinct needs of users within culturally and socially-diverse regions. This article introduces the concept of an intercultural ethics approach to AI explainability. It examines how cultural nuances impact the adoption and use of technology, the factors that impede how technical concepts such as AI are explained, and how integrating an intercultural ethics approach in the development of XAI can improve user understanding and facilitate efficient usage of these methods.
PRAXIS is a second-generation instrument that follows on from GNOSIS, which was the first instrument using fibre Bragg gratings for OH background suppression. The Bragg gratings reflect the NIR OH lines while being transparent to light between the lines. This gives a much higher signal-to-noise ratio at low resolution but also at higher resolutions by removing the scattered wings of the OH lines. The specifications call for high throughput and very low thermal and detector noise so that PRAXIS will remain sky noise limited. The optical train is made of fore-optics, an IFU, a fibre bundle, the Bragg grating unit, a second fibre bundle and a spectrograph. GNOSIS used the pre-existing IRIS2 spectrograph while PRAXIS will use a new spectrograph specifically designed for the fibre Bragg grating OH suppression and optimised for 1470 nm to 1700 nm (it can also be used in the 1090 nm to 1260 nm band by changing the grating and refocussing). This results in a significantly higher transmission due to high efficiency coatings, a VPH grating at low incident angle and low absorption glasses. The detector noise will also be lower. Throughout the PRAXIS design special care was taken at every step al
Fibre Bragg grating (FBG) OH suppression is capable of greatly reducing the bright sky background seen by near infrared spectrographs. By filtering out the airglow emission lines at high resolution before the light enters the spectrograph this technique prevents scattering from the emission lines into interline regions, thereby reducing the background at all wavelengths. In order to take full advantage of this sky background reduction the spectrograph must have very low instrumental backgrounds so that it remains sky noise limited. Both simulations and real world experience with the prototype GNOSIS system show that existing spectrographs, designed for higher sky background levels, will be unable to fully exploit the sky background reduction. We therefore propose PRAXIS, a spectrograph optimised specifically for this purpose. The PRAXIS concept is a fibre fed, fully cryogenic, fixed format spectrograph for the J and H-bands. Dark current will be minimised by using the best of the latest generation of NIR detectors while thermal backgrounds will be reduced by the use of a cryogenic fibre slit. Optimised spectral formats and the use of high throughput volume phase holographic grating
Learning how to do things from trial and error in real time is a hallmark of biological intelligence, yet most LLM-based agents lack mechanisms to acquire procedural knowledge after deployment. We propose Procedural Recall for Agents with eXperiences Indexed by State (PRAXIS), a lightweight post-training learning mechanism that stores the consequences of actions and retrieves them by jointly matching environmental and internal states of past episodes to the current state. PRAXIS augments agentic action selection with retrieved state-action-result exemplars that are generated in real time. When evaluated on the REAL web browsing benchmark, PRAXIS improves task completion accuracy, reliability, and cost efficiency across different foundation model backbones, and shows preliminary generalization to unseen tasks in similar environments. These results demonstrate that PRAXIS enables the practical adoption of AI agents in fast-evolving stateful environments by helping them learn new procedures effectively.
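The store-and-retrieve loop described above can be sketched as a small experience memory keyed by state. The abstract does not specify the matching function, so this sketch assumes a Jaccard word-overlap similarity and multiplies the environment and internal-state scores to model joint matching. The names `Experience`, `ExperienceMemory`, and `jaccard`, along with the sample episodes, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Experience:
    env_state: str        # what the agent observed (e.g. a page summary)
    internal_state: str   # what the agent was trying to do
    action: str
    result: str


def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; a stand-in for the real matcher."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


class ExperienceMemory:
    def __init__(self):
        self.records = []

    def store(self, experience):
        self.records.append(experience)

    def retrieve(self, env_state, internal_state, k=1):
        """Return the k records that best match both states jointly."""
        def score(e):
            return (jaccard(e.env_state, env_state)
                    * jaccard(e.internal_state, internal_state))
        return sorted(self.records, key=score, reverse=True)[:k]


memory = ExperienceMemory()
memory.store(Experience("checkout page with coupon field", "goal apply coupon",
                        "type code into coupon field", "discount applied"))
memory.store(Experience("search results page", "goal find product",
                        "click first result", "product page opened"))

exemplars = memory.retrieve("checkout page coupon field visible",
                            "goal apply coupon", k=1)
```

The retrieved state-action-result exemplars would then be added to the agent's context before it selects its next action. Multiplying the two scores means a record must resemble the current situation on both dimensions to rank highly.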
Deep learning (DL) has become a cornerstone of modern machine learning (ML) praxis. We introduce the R package mlr3torch, which is an extensible DL framework for the mlr3 ecosystem. It is built upon the torch package, and simplifies the definition, training, and evaluation of neural networks for both tabular data and generic tensors (e.g., images) for classification and regression. The package implements predefined architectures, and torch models can easily be converted to mlr3 learners. It also allows users to define neural networks as graphs. This representation is based on the graph language defined in mlr3pipelines and allows users to define the entire modeling workflow, including preprocessing, data augmentation, and network architecture, in a single graph. Through its integration into the mlr3 ecosystem, the package allows for convenient resampling, benchmarking, preprocessing, and more. We explain the package's design and features and show how to customize and extend it to new problems. Furthermore, we demonstrate the package's capabilities using three use cases, namely hyperparameter tuning, fine-tuning, and defining architectures for multimodal data. Finally, we present so
As the field of artificial intelligence (AI) and machine learning (ML) continues to prioritize fairness and the concern for historically marginalized communities, the importance of intersectionality in AI research has gained significant recognition. However, few studies provide practical guidance on how researchers can effectively incorporate intersectionality into critical praxis. In response, this paper presents a comprehensive framework grounded in critical reflexivity as intersectional praxis. Operationalizing intersectionality within the AI/DS (Artificial Intelligence/Data Science) pipeline, Quantitative Intersectional Data (QUINTA) is introduced as a methodological paradigm that challenges conventional and superficial research habits, particularly in data-centric processes, to identify and mitigate negative impacts such as the inadvertent marginalization caused by these practices. The framework centers researcher reflexivity to call attention to the AI researchers' power in creating and analyzing AI/DS artifacts through data-centric approaches. To illustrate the effectiveness of QUINTA, we provide a reflexive AI/DS researcher demonstration utilizing the #metoo movement as a
Commercial TTS systems produce near-native Indic audio, but the best open-source bases (Chatterbox, Indic Parler-TTS, IndicF5) trail them on measured phonological dimensions, and the most widely adopted multilingual base (Chatterbox, 23 languages) does not even tokenise Telugu or Tamil. We ask: what is the minimum intervention that brings such a non-Indic-native base to commercial-class output on Telugu, Tamil, and Hindi, without training a new acoustic decoder and without any commercial TTS training data? We combine three pieces: (1) BUPS, a Brahmic Unified Phoneme Space that deterministically romanises seven Indic scripts to ISO-15919 so Chatterbox's Latin tokeniser can process them; (2) a LoRA adapter on only the text-token predictor (Chatterbox's t3), trained on ~1,220h of licensed Indic audio with a Hindi-proxy language_id; (3) a voice-prompt recovery recipe -- an 8-11s same-language reference clip plus three sampling overrides (exaggeration 0.7, temperature 0.6, min_p 0.1; "Config B") -- that recovers commercial-class acoustic output with no acoustic-decoder training. On Hindi, the LoRA regresses accuracy and we instead use vanilla Chatterbox + Config B, giving a two-branch d
Current DAO governance praxis limits organizational expressivity and reduces complex organizational decisions to token-weighted voting due to on-chain computational limits. This paper proposes verifiable off-chain computation (leveraging Verifiable Services, TEEs, and ZK proofs) as a framework to transcend these constraints while maintaining cryptoeconomic security. This paper explores three novel governance mechanisms: (1) attestation-based systems that compute multi-dimensional stakeholder legitimacy, (2) collective intelligence through verifiable preference processing, and (3) autonomous policy execution via Policy-as-Code. The framework provides architectural specifications, security models, and implementation considerations for DAOs seeking higher-resolution expressivity and increased operational efficiency, with validation from pioneering implementations demonstrating practical viability.
Why have left-wing movements historically integrated participatory art forms (such as murals and protest songs) into their praxis, while right-wing movements have prioritized strategic communication and, more recently, the digital culture of memes? This article introduces the concept of aesthetic asymmetry to explain this divergence in political action. We argue that the asymmetry is not coincidental but the result of four interconnected structural factors: the organizational ecosystem, the moral and emotional framework, the material supports, and the historical tradition of each political spectrum. While the left tends to use art in a constitutive manner to forge community, solidarity, and hope, the contemporary right tends to use it instrumentally to mobilize polarizing affects such as humor and resentment. Drawing on comparative literature from the Theatre of the Oppressed to analyses of alt-right meme wars, we nuance this distinction and show how the aesthetic logic of each pole aligns with its strategic objectives. The article culminates in a prescriptive model for artistic action, synthesizing keys to effective mobilization into emotional, narrative, and formatting strategies
As emergent artificial intelligence technologies increasingly assert roles as assistants within intangible cultural heritage contexts, researchers and artists observe existing questions on the theme of agency negotiation, cultural resistance, and technical critique. This research interrogates power dynamics in human-AI sovereignty and entanglement for nomadic improvisational Dutar performance, a living cultural heritage through a long-necked lute from the Central Asia region. To investigate tensions between human agency and computational hegemony, the researcher and artists examined and iterated a feedback workflow that captures live performance data, processes digital transformations, and creates a real-time interactive art experience via immersive environments. Empirical data from artists and audience reveal modulations where musicians selectively embrace or reject algorithmic suggestions to preserve creative identity. The author concludes that decolonial potential requires redesigning tools or systems for cultural survivance, where technology becomes not merely a feedback environment but a site for decolonial praxis, challenging computational hegemony in digital ecosystems.
Modern, data-driven medical research requires the processing of sensitive health data on a large scale. However, this data is subject to special protection under the GDPR, which is why processing regularly raises data protection concerns in practice. These concerns are particularly prevalent when sensitive personal data is processed without informed consent. This article analyses options for data processing in the field of medical research without consent and describes the legal framework for anonymisation under the GDPR, the national Austrian implementation of the research exemption, and their interaction.
Algorithmic harms are commonly categorized as either allocative or representational. This study specifically addresses the latter, focusing on an examination of current definitions of representational harms to discern what is included and what is not. This analysis motivates our expansion beyond behavioral definitions to encompass harms to cognitive and affective states. The paper outlines high-level requirements for measurement: identifying the necessary expertise to implement this approach and illustrating it through a case study. Our work highlights the unique vulnerabilities of large language models to perpetrating representational harms, particularly when these harms go unmeasured and unmitigated. The work concludes by presenting proposed mitigations and delineating when to employ them. The overarching aim of this research is to establish a framework for broadening the definition of representational harms and to translate insights from fairness research into practical measurement and mitigation praxis.
Voice is a natural mode of expression offered by modern computer-based systems. Qualitative perspectives on voice-based user experiences (voice UX) offer rich descriptions of complex interactions that numbers alone cannot fully represent. We conducted a systematic review of the literature on qualitative approaches to voice UX, capturing the nature of this body of work in a systematic map and offering a qualitative synthesis of findings. We highlight the benefits of qualitative methods for voice UX research, identify opportunities for increasing rigour in methods and outcomes, and distill patterns of experience across a diversity of devices and modes of qualitative praxis.