Found 20 results
Effective instruction in tutoring requires promptly providing instructional materials that match the needs of each student (e.g., in response to questions). In this study, we introduce an agent that automatically delivers supplementary materials on demand during one-on-one tutoring sessions. Our agent uses a multimodal large language model to analyze spoken dialogue between the instructor and the student, automatically generate search queries, and retrieve relevant Web images. Evaluation experiments demonstrate that our agent reduces the average image retrieval time by 44.4 s compared to cases without support and successfully provides images of acceptable quality in 85.7% of trials. These results indicate that our agent effectively supports instructors during tutoring sessions.
Despite the growing use of large language models (LLMs) for providing feedback, limited research has explored how to achieve high-quality feedback. This case study introduces an evaluation framework to assess different zero-shot prompt engineering methods. We varied the prompts systematically and analyzed the provided feedback on programming errors in R. The results suggest that prompts suggesting a stepwise procedure increase the precision, while omitting explicit specifications about which provided data to analyze improves error identification.
In this paper, we build a case for providing job completion time predictions to cloud users, similar to the delivery date of a package or arrival time of a booked ride. Our analysis reveals that providing predictability can come at the expense of performance and fairness. Existing cloud scheduling systems optimize for extreme points in the trade-off space, making them either extremely unpredictable or impractical. To address this challenge, we present PCS, a new scheduling framework that aims to provide predictability while balancing other traditional objectives. The key idea behind PCS is to use Weighted-Fair-Queueing (WFQ) and find a suitable configuration of different WFQ parameters (e.g., class weights) that meets specific goals for predictability. It uses a simulation-aided search strategy to efficiently discover WFQ configurations that lie on the Pareto front of the trade-off space between these objectives. We implement and evaluate PCS in the context of DNN job scheduling on GPUs. Our evaluation, on a small-scale GPU testbed and larger-scale simulations, shows that PCS can provide accurate completion time estimates while marginally compromising on performance and fairness.
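As a rough illustration of how WFQ class weights shape service order, the sketch below assigns each job a virtual finish tag and serves jobs in tag order. It is a textbook-style simplification and not the PCS implementation: it assumes all jobs are backlogged at time 0, and the names `wfq_order`, `jobs`, and `weights` are hypothetical.

```python
def wfq_order(jobs, weights):
    """Return job names in WFQ service order.

    jobs: list of (name, class_id, size) tuples, all assumed to be
          queued at time 0 (a simplifying assumption).
    weights: dict mapping class_id to a positive class weight.
    """
    last_finish = {}  # most recent finish tag per class
    tagged = []
    for idx, (name, cls, size) in enumerate(jobs):
        # Finish tag grows by size / weight, so heavier classes advance slower
        # in virtual time and therefore get served more often.
        finish = last_finish.get(cls, 0.0) + size / weights[cls]
        last_finish[cls] = finish
        # idx breaks ties by submission order.
        tagged.append((finish, idx, name))
    tagged.sort()
    return [name for _, _, name in tagged]


jobs = [("a1", "A", 4), ("a2", "A", 4), ("b1", "B", 4)]
order_a_heavy = wfq_order(jobs, {"A": 2, "B": 1})
order_b_heavy = wfq_order(jobs, {"A": 1, "B": 2})
```

Changing only the weights reorders the same jobs, which is the kind of knob a configuration search over WFQ parameters would tune.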
In their recent paper, Rosen, Takeyama, Tasaka, and Yamamoto constructed recurrent sequences providing a decomposition law of primes in a Galois extension. In this paper, we reconstruct their sequences via representation theory of finite groups and obtain an explicit description of the sequences.
As autonomous systems become more complex and integral in our society, the need to accurately model and safely control these systems has increased significantly. In the past decade, there has been tremendous success in using deep learning techniques to model and control systems that are difficult to model using first principles. However, providing safety assurances for such systems remains difficult, partially due to the uncertainty in the learned model. In this work, we aim to provide safety assurances for systems whose dynamics are not readily derived from first principles and, hence, are more advantageous to be learned using deep learning techniques. Given the system of interest and safety constraints, we learn an ensemble model of the system dynamics from data. Leveraging ensemble uncertainty as a measure of uncertainty in the learned dynamics model, we compute a maximal robust control invariant set, starting from which the system is guaranteed to satisfy the safety constraints under the condition that realized model uncertainties are contained in the predefined set of admissible model uncertainty. We demonstrate the effectiveness of our method using a simulated case study with
An emerging definition of fairness in machine learning requires that models are oblivious to demographic user information, e.g., a user's gender or age should not influence the model. Personalized recommender systems are particularly prone to violating this definition through their explicit user focus and user modelling. Explicit user modelling is also an aspect that makes many recommender systems incapable of providing hitherto unseen users with recommendations. We propose novel approaches for mitigating discrimination in Variational Autoencoder-based recommender systems by limiting the encoding of demographic information. The approaches are capable of, and evaluated on, providing users that are not represented in the training data with fair recommendations.
Recent advancements in large language models, such as ChatGPT, have demonstrated significant potential to impact various aspects of human life. However, ChatGPT still faces challenges in providing reliable and accurate answers to user questions. To better understand the model's particular weaknesses in providing truthful answers, we embark on an in-depth exploration of open-domain question answering. Specifically, we undertake a detailed examination of ChatGPT's failures, categorized into: comprehension, factuality, specificity, and inference. We further pinpoint factuality as the largest contributor to failures and identify two critical abilities associated with factuality: knowledge memorization and knowledge recall. Through experiments focusing on factuality, we propose several potential enhancement strategies. Our findings suggest that augmenting the model with granular external knowledge and cues for knowledge recall can enhance the model's factuality in answering questions.
Smart home technology is part of our everyday lives, and it is evolving quickly compared to other technologies. In this paper, we gather user feedback through expert interviews on how collecting feedback from smart home devices can help improve the devices. Little is yet known about the feedback systems of smart home devices or how the feedback provided will help extend the devices' requirements. We present an analysis of exploratory interviews with students from one group and study their attitudes toward providing feedback. The results suggest that users are willing to give feedback actively to improve their experience, since every user has their own needs to fulfill.
For effective collaboration between humans and intelligent agents that employ machine learning for decision-making, humans must understand what agents can and cannot do to avoid over/under-reliance. A solution to this problem is adjusting human reliance through communication using reliance calibration cues (RCCs) to help humans assess agents' capabilities. Previous studies typically attempted to calibrate reliance by continuously presenting RCCs, and when an agent should provide RCCs remains an open question. To answer this, we propose Pred-RC, a method for selectively providing RCCs. Pred-RC uses a cognitive reliance model to predict whether a human will assign a task to an agent. By comparing the prediction results for both cases with and without an RCC, Pred-RC evaluates the influence of the RCC on human reliance. We tested Pred-RC in a human-AI collaboration task and found that it can successfully calibrate human reliance with a reduced number of RCCs.
Recent research in the social sciences has identified situations in which small changes in the way that information is provided to consumers can have large aggregate effects on behavior. This has been promoted in popular media in areas of public health and wellness, but its application to other areas has not been broadly studied. This paper presents a simple model which expresses the effect of providing commuters with carefully-curated information regarding aggregate traffic "slowdowns" on the various roads in a transportation network. Much of the work on providing information to commuters focuses specifically on travel-time information. However, the model in the present paper allows a system planner to provide slowdown information as well; that is, commuters are additionally told how much slower each route is as compared to its uncongested state. We show that providing this additional information can improve equilibrium routing efficiency when compared to the case when commuters are only given information about travel time, but that these improvements in congestion are not universal. That is, transportation networks exist on which any provision of slowdown information can harm equ
The limit order book mechanism has been the core trading mechanism of the modern financial market. In the cryptocurrency market, centralized exchanges also adopt this limit order book mechanism and a centralized matching engine dynamically connects the traders to the orders of market makers. Recently, decentralized exchanges have been introduced and received considerable attention in the cryptocurrency community. A decentralized exchange typically adopts an automated market maker, which algorithmically arbitrates the trades between liquidity providers and traders through a pool of crypto assets. Meanwhile, the liquidity of the exchange is the most important factor when traders choose an exchange. However, the amount of liquidity provided by the liquidity providers in decentralized exchanges is insufficient when compared to centralized exchanges. This is because the liquidity providers in decentralized exchanges suffer from the risk of divergence loss inherent to the automated market making system. To this end, we introduce a new concept called margin liquidity and leverage this concept to propose a highly profitable margin liquidity-providing position. Then, we extend this margin l
Residential Thermostatically Controlled Loads (TCLs) such as Air Conditioners (ACs), heat pumps, water heaters, and refrigerators have an enormous thermal storage potential for providing regulation reserve to the grid. In this paper, we study the potential resource and economic analysis of TCLs providing frequency regulation service. In particular, we show that the potential resource of TCLs in California is more than enough for both current and predicted near-future regulation requirements for the California power system. Moreover, we estimate the cost and revenue of TCLs, discuss the qualification requirements, recommended policy changes, and participation incentive methods, and compare TCLs with other energy storage technologies. We show that TCLs are potentially more cost-effective than other energy storage technologies such as flywheels, Li-ion, advanced lead acid, and Zinc Bromide batteries.
Given a point query Q in multi-dimensional space, K-Nearest Neighbor (KNN) queries return the K closest answers in the database with respect to Q, according to a given distance metric. In this scenario, it is possible that a majority of the answers may be very similar to one another, especially when the data has clusters. For a variety of applications, such homogeneous result sets may not add value to the user. In this paper, we consider the problem of providing diversity in the results of KNN queries, that is, to produce the closest result set such that each answer is sufficiently different from the rest. We first propose a user-tunable definition of diversity, and then present an algorithm, called MOTLEY, for producing a diverse result set as per this definition. Through a detailed experimental evaluation on real and synthetic data, we show that MOTLEY can produce diverse result sets by reading only a small fraction of the tuples in the database. Further, it imposes no additional overhead on the evaluation of traditional KNN queries, thereby providing a seamless interface between diversity and distance.
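To make the idea of a diverse KNN result concrete, here is a minimal greedy sketch. It is not the MOTLEY algorithm and does not use the paper's diversity definition: it assumes Euclidean distance and a hypothetical `min_sep` threshold, and it scans every point instead of reading only a fraction of the database.

```python
import math


def diverse_knn(points, query, k, min_sep):
    """Greedy diverse KNN (illustrative only).

    Scan points by increasing distance to `query` and keep a point only
    if it lies at least `min_sep` away from every point already kept.
    """
    ordered = sorted(points, key=lambda p: math.dist(p, query))
    result = []
    for p in ordered:
        if all(math.dist(p, r) >= min_sep for r in result):
            result.append(p)
            if len(result) == k:
                break
    return result


points = [(0, 0), (0.1, 0), (1, 0), (1.05, 0), (3, 0)]
plain = diverse_knn(points, (0, 0), k=3, min_sep=0.0)    # ordinary KNN
diverse = diverse_knn(points, (0, 0), k=3, min_sep=0.5)  # near-duplicates skipped
```

Setting `min_sep` to 0 recovers the ordinary K nearest neighbors, while a larger value trades some closeness for variety.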
Recent enhancements have been proposed to the ATM Unspecified Bit Rate (UBR) service that guarantee a minimum rate at the frame level to the UBR VCs. These enhancements have been called Guaranteed Frame Rate (GFR). In this paper, we discuss the motivation, design and implementation issues for GFR. We present the design of buffer management and policing mechanisms to implement GFR. We study the effects of policing, per-VC buffer allocation, and per-VC queuing on providing GFR to TCP/IP traffic. We conclude that per-VC scheduling is necessary to provide minimum rate guarantees to TCP traffic. We examine the role of frame tagging in the presence of scheduling and buffer management for providing minimum rate guarantees. The use of GFR to support the Internet Controlled Load Service is also discussed.
Providing public access to unprotected digital data can pose a threat of unwanted disclosure of restricted information. The problem of protecting such information can be divided into two main subclasses, namely, individual and group data anonymity. By group anonymity we mean protecting important data patterns, distributions, and collective features which cannot be determined through analyzing individual records only. An effective and comparatively simple way of solving the group anonymity problem is to apply the wavelet transform. It is easy to implement, powerful enough, and might produce acceptable results if used properly. In the paper, we present a novel method of using the wavelet transform for providing group anonymity; it is achieved by redistributing wavelet approximation values, while simultaneously fixing the data mean value and leaving wavelet details unchanged (or proportionally altering them). Moreover, we provide a comprehensive example to illustrate the method.
The goal of this paper is to investigate the importance of providing visual "big pictures" in the teaching of economics. The plurality and variety of concepts, variables, diagrams, and models involved in economics can be a source of confusion for many economics students. However, reviewing the existing literature on the importance of providing visual "big pictures" in the process of learning suggests that furnishing students with a visual "big picture" that illustrates the ways through which those numerous, diverse concepts are connected to each other could be an effective solution to clear up the mentioned mental chaos. As a practical example, this paper introduces a "big picture" that can be used as a good resource in intermediate macroeconomics classes. This figure presents twenty-seven commonly-discussed macroeconomic diagrams in the intermediate macroeconomics course, and gives little detail on some of these diagrams, aiming at helping students to get the whole picture at once on a single piece of paper. This macroeconomics big picture mostly focuses on the routes through which common diagrams in macroeconomics are connected to each other, and finally introduces the general ma
A univariate polynomial equation is presented. It provides models of the thermal lattice Boltzmann equation. The models can be accurate up to any required level and can be applied to regular lattices, which allow efficient and accurate approximate solutions of the Boltzmann equation. We derive models satisfying the complete Galilean invariant and providing accuracy of the 4th-order moment and beyond. We simulate thermal shock tube problems to illustrate the accuracy of our models and to show the remarkably enhanced stability obtained by our models and our discretized equilibrium distributions.
Solar and fuel cell energy are two major Renewable Energy Resources (RES) used to power electrical grids. They can provide clean energy and help to reduce greenhouse emissions. RES are changing the structure of electrical systems day by day. They supply loads and enable distributed systems. The concept of the microgrid is defined by connecting distributed generation sources and loads that can run in islanded and interconnected modes. In this research, solar and fuel cells are used as sources to supply variable loads. They try to deliver a defined power to the load. The solar system is modeled by a single-diode equivalent circuit, and a perturb-and-observe (P&O) approach is applied to obtain full power from the solar panels. The Direct Methanol Fuel Cell Model (DFMC) is another renewable energy resource that is used in this microgrid to supply the load. This model includes two Gibbs reactors, one for the anode and one for the cathode, and a splitter between the anode and cathode. A droop controller is used to control the power injected from the resources to the loads. The effectiveness of the designed micr
Machine learning predictors have been increasingly applied in production settings, including in one of the world's largest hiring platforms, Hired, to provide a better candidate and recruiter experience. The ability to provide actionable feedback is desirable for candidates to improve their chances of achieving success in the marketplace. Until recently, however, methods aimed at providing actionable feedback have been limited in terms of realism and latency. In this work, we demonstrate how, by applying a newly introduced method based on Generative Adversarial Networks (GANs), we are able to overcome these limitations and provide actionable feedback in real-time to candidates in production settings. Our experimental results highlight the significant benefits of utilizing a GAN-based approach on our dataset relative to two other state-of-the-art approaches (including over 1000x latency gains). We also illustrate the potential impact of this approach in detail on two real candidate profile examples.
An LLM is stable if it reaches the same conclusion when asked the identical question multiple times. We find leading LLMs like gpt-4o, claude-3.5, and gemini-1.5 are unstable when providing answers to hard legal questions, even when made as deterministic as possible by setting temperature to 0. We curate and release a novel dataset of 500 legal questions distilled from real cases, involving two parties, with facts, competing legal arguments, and the question of which party should prevail. When provided the exact same question, we observe that LLMs sometimes say one party should win, while other times saying the other party should win. This instability has implications for the increasing numbers of legal AI products, legal processes, and lawyers relying on these LLMs.
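The stability notion in this abstract can be sketched as a small check over repeated answers to one question. The helper names and the agreement-rate measure below are assumptions for illustration, not the paper's metric, and the answers are made-up labels and not real model output.

```python
from collections import Counter


def is_stable(answers):
    """True if every repeated run reached the same conclusion."""
    return len(set(answers)) == 1


def agreement_rate(answers):
    """Fraction of runs that agree with the most common answer."""
    if not answers:
        raise ValueError("need at least one answer")
    _, top_count = Counter(answers).most_common(1)[0]
    return top_count / len(answers)


# Hypothetical outcomes of asking the same question four times.
runs = ["party_1", "party_1", "party_2", "party_1"]
stable = is_stable(runs)
rate = agreement_rate(runs)
```

A question counts as unstable as soon as one run disagrees; the agreement rate gives a graded view of how often the modal answer appears.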