Despite the impressive capabilities of Large Language Models (LLMs) on various tasks, they still struggle with scenarios that involve complex reasoning and planning. Recent work has proposed advanced prompting techniques and emphasized the necessity of fine-tuning with high-quality data to augment LLMs' reasoning abilities. However, these approaches are inherently constrained by data availability and quality. In light of this, self-correction and self-learning emerge as viable solutions, employing strategies that allow LLMs to refine their outputs and learn from self-assessed rewards. Yet, the efficacy of LLMs in self-refining their responses, particularly in complex reasoning and planning tasks, remains dubious. In this paper, we introduce AlphaLLM for the self-improvement of LLMs, which integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop, thereby enhancing the capabilities of LLMs without additional annotations. Drawing inspiration from the success of AlphaGo, AlphaLLM addresses the unique challenges of combining MCTS with LLMs for self-improvement, including data scarcity, the vastness of the search spaces of language tasks, and the subjective nature of feedback in lan
In this paper, we propose a novel Adversarial Training (AT) approach for end-to-end speech recognition using a Criticizing Language Model (CLM). In this way, the CLM and the automatic speech recognition (ASR) model can challenge and learn from each other iteratively to improve performance. Since the CLM takes only text as input, huge quantities of unpaired text data can be utilized in this approach within end-to-end training. Moreover, AT can be applied to any end-to-end ASR model using any deep-learning-based language modeling framework, and is compatible with any existing end-to-end decoding method. Initial results with an example experimental setup demonstrate that the proposed approach is able to gain consistent improvements efficiently from auxiliary text data under different scenarios.
The commercial was submitted by the Freedom of the Press Foundation to run during Donald Trump’s UFC event. It criticized the $111 billion merger as a threat to the First Amendment.
Criticality has been proposed as a key principle underlying complex behavior in biological and artificial systems; however, how criticality translates from individual dynamics to collective behavior remains unclear. We study this question using a multi-agent system with spatially constrained interactions in which agents sense neighboring light signals through exteroceptors and act by switching their own light on or off, thereby forming a dynamical interaction network at the macroscopic level. The agents' internal states are themselves governed by a reservoir dynamical system at the microscopic level. By varying the microscopic parameters around dynamical criticality, together with the macroscopic interaction topology, we systematically investigate the relation between the two levels. We find that near-critical dynamics within individual agents is not sufficient to produce collective critical-like avalanche statistics. Instead, scale-free behavior depends on the effective connectivity of the macroscopic interaction network, which controls activity propagation. As a result, macroscopic critical-like dynamics are enabled by microscopic regimes that deviate from criticality, with the r
A major unresolved question in neuroscience is: What is the origin of the observed scale-invariant correlations in neural activity? Many researchers support the ``criticality hypothesis,'' which proposes that the brain operates near criticality, optimizing various information processing functions. However, the nature and behavior of criticality in cortical systems are still unclear. As an alternative, this opinion paper highlights that the coupling between neurons and slowly varying energetic resources, which may act as a form of ``memory,'' may alone be sufficient to generate a robust phase of neural activity with scale-invariant correlations. This memory-induced long-range order phase could provide a more natural explanation of the existing experimental data than the criticality hypothesis.
We consider the stationary problem for the quasi-geostrophic equation with critical and super-critical dissipation and prove the unique existence of small solutions for a given small external force in the framework of scaling-critical Sobolev spaces. Moreover, we show that the data-to-solution map is continuous. Since the critical and super-critical cases involve a derivative loss, which affects the class of continuity of the data-to-solution map, we reveal that the map is no longer uniformly continuous, in contrast to the sub-critical case, where Lipschitz continuity holds.
Understanding the world through models is a fundamental goal of scientific research. While large language model (LLM) based approaches show promise in automating scientific discovery, they often overlook the importance of criticizing scientific models. Criticizing models deepens scientific understanding and drives the development of more accurate models. Automating model criticism is difficult because it traditionally requires a human expert to define how to compare a model with data and evaluate if the discrepancies are significant--both rely heavily on understanding the modeling assumptions and domain. Although LLM-based critic approaches are appealing, they introduce new challenges: LLMs might hallucinate the critiques themselves. Motivated by this, we introduce CriticAL (Critic Automation with Language Models). CriticAL uses LLMs to generate summary statistics that capture discrepancies between model predictions and data, and applies hypothesis tests to evaluate their significance. We can view CriticAL as a verifier that validates models and their critiques by embedding them in a hypothesis testing framework. In experiments, we evaluate CriticAL across key quantitative and qual
Criticality is a fundamental notion in graph theory that has been studied continually since its introduction in the early 1950s by Dirac. A graph is called $k$-vertex-critical ($k$-edge-critical) if it is $k$-chromatic but removing any vertex (edge) lowers the chromatic number to $k-1$. A set of edges in a graph is called critical if its removal reduces the chromatic number of the graph. In 1970, Dirac conjectured a rather strong distinction between the notions of vertex- and edge-criticality, namely that for every $k\ge 4$ there exists a $k$-vertex-critical graph that does not have any critical edges. This conjecture was proved for $k\ge 5$ by Jensen in 2002 and remains open only for $k=4$. A much stronger version of Dirac's conjecture was proposed by Erdős in 1985: Let $k\ge 4$ be fixed, and let $f_k(n)$ denote the largest integer such that there exists a $k$-vertex-critical graph of order $n$ in which no set of at most $f_k(n)$ edges is critical. Is it true that $f_k(n)\rightarrow \infty$ for $n\rightarrow \infty$? Strengthening previous partial results, we solve this problem affirmatively for all $k>4$, proving that $$f_k(n)=\Omega(n^{1/3}).$$ This leaves only the case $k=4$ open.
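The definitions of vertex-criticality and critical edges above can be checked by brute force on very small graphs. The following is a minimal illustrative sketch, not the method of the paper: the function names (`chromatic_number`, `is_vertex_critical`, `critical_edges`) and the edge-list representation are assumptions chosen for this example, and the exhaustive search is only practical for a handful of vertices.

```python
from itertools import product


def chromatic_number(vertices, edges):
    """Smallest k such that the graph has a proper k-coloring (exhaustive search)."""
    vertices = list(vertices)
    if not vertices:
        return 0
    for k in range(1, len(vertices) + 1):
        # Try every assignment of k colors to the vertices.
        for colors in product(range(k), repeat=len(vertices)):
            coloring = dict(zip(vertices, colors))
            if all(coloring[u] != coloring[v] for u, v in edges):
                return k
    return len(vertices)


def is_vertex_critical(vertices, edges):
    """True if removing any single vertex lowers the chromatic number by one."""
    chi = chromatic_number(vertices, edges)
    for v in vertices:
        rest = [u for u in vertices if u != v]
        rest_edges = [(a, b) for a, b in edges if v not in (a, b)]
        if chromatic_number(rest, rest_edges) != chi - 1:
            return False
    return True


def critical_edges(vertices, edges):
    """Edges whose individual removal reduces the chromatic number."""
    chi = chromatic_number(vertices, edges)
    return [
        e for e in edges
        if chromatic_number(vertices, [f for f in edges if f != e]) < chi
    ]


# Example graphs: the 5-cycle and the 4-cycle.
c5_vertices = [0, 1, 2, 3, 4]
c5_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
c4_vertices = [0, 1, 2, 3]
c4_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

The 5-cycle is 3-chromatic, vertex-critical, and every one of its edges is critical, whereas the 4-cycle is not vertex-critical. The graphs in Dirac's conjecture are the opposite extreme: vertex-critical with no critical edge at all.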
Critical systems host nontrivial entanglement structure that is generally sensitive to additional couplings. In the present work, we study the effect of weak measurements on the entanglement Hamiltonian of massless free fermions which are prepared in their critical ground state. While power-law decaying correlations and logarithmically growing entanglement entropy have been observed as typical signatures of quantum criticality after the weak measurement, in this work we show that the conformal symmetry is lost and the entanglement Hamiltonian generally becomes gapped for arbitrarily small measurement strength. To reveal this unconventional entanglement structure, we consider a field-theory description that allows us to establish an analytic mapping between the entanglement Hamiltonians before and after the weak measurements. From this mapping, we find that although the measurements lead to a significant modification of the entanglement spectrum, the real-space distribution of the eigenfunctions of the kernel of the entanglement Hamiltonian is unchanged, which is responsible for the coexistence of a gapped entanglement Hamiltonian and the logarithmic entanglement entropy. Moreover, as the m
Hayut and the first author isolated the notion of a critical cardinal in [1]. In this work we answer several questions raised in the original paper. We show that it is consistent for a critical cardinal to not have any ultrapower elementary embeddings, as well as that it is consistent that no target model is closed. We also prove that if $\kappa$ is the critical point of any ultrapower embedding, then it is the critical point of a normal ultrapower embedding. The paper contains several open questions of interest in the study of critical cardinals.
As part of a chapter for a book titled "50 years of the renormalization group", dedicated to the memory of Michael E. Fisher, edited by Amnon Aharony, Ora Entin-Wohlman, David Huse, and Leo Radzihovsky, I review a class of novel ordered states of "critical matter" that exhibit strongly fluctuating universal power-law orders, controlled by an infrared-attractive, non-Gaussian fixed point. I illustrate how RG methods pioneered by Wilson and Fisher can be used to deduce the critical phenomenology of such critical phases, which resembles that of a critical point of a second-order phase transition but requires no fine tuning.
Quantum critical phenomena influence the finite-temperature behavior of condensed matter systems through quantum critical fans whose extents are determined by the exponents of the zero-temperature criticality. Here we emphasize the aspects of quantum critical lines, as discussed previously, and study an exactly solved model involving a transverse field Ising model with an added three-spin interaction. This model has three critical lines. We compute the spin-spin correlation function, extract the correlation length, and identify the crossovers from the quantum critical regime to the quantum disordered or renormalized classical regimes. We construct the quantum critical fans along one of the critical lines. In addition, we construct finite-temperature dynamic structure factors. We hope this model will become experimentally realizable in the future, and that our results will stimulate studies of many similar models.
Large language models (LLMs) have showcased remarkable potential across various tasks by conditioning on prompts. However, the quality of different human-written prompts leads to substantial discrepancies in LLMs' performance, and improving prompts usually necessitates considerable human effort and expertise. To this end, this paper proposes Prompt with Actor-Critic Editing (PACE) for LLMs to enable automatic prompt editing. Drawing inspiration from the actor-critic algorithm in reinforcement learning, PACE leverages LLMs in the dual roles of actors and critics, conceptualizing the prompt as a type of policy. PACE refines the prompt, taking into account the feedback from both actors executing the prompt and critics criticizing the response. This process helps LLMs better align the prompt to a specific task, thanks to real responses and thinking from LLMs. We conduct extensive experiments on 24 instruction induction tasks and 21 BIG-bench tasks. Experimental results indicate that PACE elevates the relative performance of medium/low-quality human-written prompts by up to 98\%, making them comparable in performance to high-quality human-written prompts. Moreover, PACE also exhibits notable efficacy for promp
We discuss a recent work by J.~Lawrence et al. [arxiv.org/abs/2208.11793] criticizing relational quantum mechanics (RQM) and based on a famous nonlocality theorem going back to Greenberger, Horne, and Zeilinger (GHZ). Here, we show that the claims presented in this recent work are unjustified and we debunk the analysis.
Critical fluctuations in fluids and fluid mixtures yield a nonanalytic asymptotic Ising-like critical thermodynamic behavior in terms of power laws with universal exponents. In polymer solutions, the amplitudes of these power laws depend on the degree of polymerization. Nonasymptotic behavior (upon the departure from the critical point) is particularly interesting in the case of polymer solutions, where it is governed by a competition between the correlation length of the critical fluctuations and the radius of gyration of the polymer molecules. If the correlation length is the dominant length scale, Ising-like critical behavior is observed. If, however, the radius of gyration exceeds the correlation length, tricritical behavior with mean-field critical exponents is observed. The Ising-like critical region shrinks with the increase of the polymer molecular weight. In the limit of an infinite degree of polymerization, the Ising-like critical region vanishes, yielding to theta-point tricriticality.
We investigate the relaxation of holographic superfluids after quenches, when the end state is either tuned to be exactly at the critical point, or very close to it. By solving the bulk equations of motion numerically, we demonstrate that in the former case the system exhibits a power-law falloff as well as an emergent discrete scale invariance. The latter case is in the regime dominated by critical slowing down, and we show that there is an intermediate time range before the onset of late-time exponential falloff, where the system behaves similarly to the critical point with its power-law falloff. We further postulate a phenomenological Gross-Pitaevskii-like equation that is able to make quantitative predictions for the behavior of the holographic superfluid after near-critical quenches. Intriguingly, all parameters of our phenomenological equation, which describes the non-linear time evolution, may be fixed with information from the static equilibrium solutions and linear response theory.
The application of Bayesian networks (BNs) to cognitive assessment and intelligent tutoring systems poses new challenges for model construction. When cognitive task analyses suggest constructing a BN with several latent variables, empirical model criticism of the latent structure becomes both critical and complex. This paper introduces a methodology for criticizing models both globally (a BN in its entirety) and locally (observable nodes), and explores its value in identifying several kinds of misfit: node errors, edge errors, state errors, and prior probability errors in the latent structure. The results suggest the indices have potential for detecting model misfit and assisting in locating problematic components of the model.
The goal of causal inference is to understand the outcome of alternative courses of action. However, all causal inference requires assumptions. Such assumptions can be more influential than in typical tasks for probabilistic modeling, and testing those assumptions is important to assess the validity of causal inference. We develop model criticism for Bayesian causal inference, building on the idea of posterior predictive checks to assess model fit. Our approach involves decomposing the problem, separately criticizing the model of treatment assignments and the model of outcomes. Conditioned on the assumption of unconfoundedness---that the treatments are assigned independently of the potential outcomes---we show how to check any additional modeling assumption. Our approach provides a foundation for diagnosing model-based causal inferences.
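A posterior predictive check, the idea this abstract builds on, can be sketched in a few lines. The example below is a generic illustration and not the authors' procedure for causal models. It assumes a deliberately simple model (normally distributed data with known standard deviation `sigma` and a flat prior on the mean, so the posterior of the mean is normal), and the function name `posterior_predictive_pvalue` and its parameters are hypothetical.

```python
import random
import statistics


def posterior_predictive_pvalue(data, statistic, n_rep=2000, sigma=1.0, seed=0):
    """Estimate P(statistic(replicated) >= statistic(observed)) under the fitted model.

    Assumed model: data ~ Normal(mu, sigma) with sigma known and a flat prior
    on mu, which gives the posterior mu ~ Normal(mean(data), sigma / sqrt(n)).
    """
    rng = random.Random(seed)
    n = len(data)
    observed = statistic(data)
    post_mean = statistics.fmean(data)
    post_sd = sigma / n ** 0.5
    exceed = 0
    for _ in range(n_rep):
        # Draw a parameter from the posterior, then a replicated data set.
        mu = rng.gauss(post_mean, post_sd)
        replicated = [rng.gauss(mu, sigma) for _ in range(n)]
        if statistic(replicated) >= observed:
            exceed += 1
    return exceed / n_rep


# Illustrative data containing one value the assumed model cannot explain.
data_with_outlier = [0.1, -0.3, 0.2, 0.0, -0.1, 0.4, -0.2, 0.3, -0.4, 12.0]
p_value = posterior_predictive_pvalue(data_with_outlier, max)
```

A p-value close to 0 or 1 signals that the chosen statistic (here the maximum) looks atypical under the model. In the causal setting described above, such checks would be run separately on the treatment-assignment model and on the outcome model.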
We reply to a recent comment by Diehl and Shpot (cond-mat/0305131) criticizing a new approach to the Lifshitz critical behavior recently presented (M. M. Leite, Phys. Rev. B 67, 104415 (2003)). We show that this approach is free of inconsistencies in the ultraviolet regime. We recall that the orthogonal approximation employed to solve arbitrary loop diagrams, worked out in the criticized paper even at three-loop level, is consistent with homogeneity for arbitrary loop momenta. We show that the criticism is incorrect.
Scientists have successfully tested an AI-designed universal coronavirus vaccine in humans for the first time, finding it to be safe and well tolerated. The vaccine generated immune responses against multiple coronaviruses, including SARS-CoV-2, SARS, and related bat viruses with pandemic potential. By targeting features shared across an entire vir