Search - ResearchTracker

The asymptotic behaviour of Monte Carlo Exploring Starts (MCES) is a long-standing open question in reinforcement learning, even in the tabular setting. We investigated the convergence properties of tabular MCES by constructing examples in which the algorithm converges to suboptimal solutions. This paper presents new counterexamples for both initial-visit and first-visit MCES and gives a convergence-restoring modification for the initial-visit case. We show that stable suboptimal solutions may exist for initial-visit MCES with sample-average updates even when greedy actions are updated more often than non-greedy actions on average. However, by scaling learning rates inversely to update frequencies on a state-by-state basis, convergence to optimality is guaranteed. Unlike previous uniformisation methods, this modification is applicable to large-scale problems that require approximating the estimated value function. We then extend the example to show that sample-average first-visit MCES may also converge to suboptimal solutions. This largely settles a fundamental open problem and shows that exploring starts alone do not guarantee convergence to optimality. More broadly, these results

Dynamics and non-integrability of the variable-length double pendulum: exploring chaos and periodicity via the Lyapunov refined maps

arXiv, 2026-02-24. Authors: Wojciech Szumiński, Tomasz Kapitaniak

This paper extends our previous work (Szumiński and Maciejewski, 2024), where we explored the dynamics and integrability of the double-spring pendulum. Here, we investigate the variable-length double pendulum, a three-degree-of-freedom Hamiltonian system combining features of the classic double pendulum and the swinging Atwood machine. With its intricate dynamics, this system is crucial for studying nonlinear phenomena such as high-order resonances, chaos, and bifurcations. We address the challenges posed by high-dimensional phase spaces using a novel tool, the Lyapunov refined maps, which integrates Poincaré sections, phase-parametric diagrams, and Lyapunov exponents. This framework comprehensively analyzes periodic, quasi-periodic, and chaotic behaviors. By measuring the strength of chaos, it also offers insights into the system's dynamical structure. Additionally, we apply Morales-Ramis theory to examine integrability, leveraging the differential Galois group of variational equations to establish non-integrability conditions. The Kovacic algorithm is used to analyze the solvability of higher-dimensional differential equations, complemented by Lyapunov exponent diagrams

Search results: Exploring

Exploring Starts Are Not Enough: Counterexamples and a Fix for Monte Carlo Exploring Starts

Dynamics and non-integrability of the variable-length double pendulum: exploring chaos and periodicity via the Lyapunov refined maps

Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing

Exploring Exploration in Bayesian Optimization

Exploring Formal Math on the Blockchain: An Explorer for Proofgold

AdvAD: Exploring Non-Parametric Diffusion for Imperceptible Adversarial Attacks

Exploring Flow-Lenia Universes with a Curiosity-driven AI Scientist: Discovering Diverse Ecosystem Dynamics

Exploring Augmentation and Cognitive Strategies for AI based Synthetic Personae

Exploring and Applying Audio-Based Sentiment Analysis in Music

Exploring Bengali Religious Dialect Biases in Large Language Models with Evaluation Perspectives

Exploring the Landscape for Generative Sequence Models for Specialized Data Synthesis

Interstellar Photovoltaics for Exploring Alien Solar Systems

Exploring and Exploiting Data Heterogeneity in Recommendation

Less Can Be More: Exploring Population Rating Dispositions with Partitioned Models in Recommender Systems

Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving

Exploring AI Futures Through Role Play

A Co-analysis Framework for Exploring Multivariate Scientific Data

ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond

Exploring Effective Information Utilization in Multi-Turn Topic-Driven Conversations

Learning controllable dynamics through informative exploration