Found 20 results
We use logistic regression to estimate the value of the pieces in standard chess and several chess variants, namely Chess 960, Atomic chess, Antichess, and Horde chess. We perform our regressions on several years of data from Lichess, the free and open-source internet chess server. We use the published player ratings to control for the confounding effect of differential player skill. We adjust for the attenuation bias in regressions due to the noise in observed ratings. We find that major piece values, relative to the value of a pawn, are fairly consistent with historical valuation systems. However, we find that bishops are valued slightly higher than knights. We find that piece values are smaller, in absolute value, in Atomic and Antichess than in standard chess. We also present approximate values of the pieces to equalize odds when players of varying skill face off. We briefly consider self-play experiments using the Stockfish engine, which give a contrasting view of piece value.
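As a rough illustration of how a fitted logistic model of this kind can be read, the sketch below turns log-odds coefficients into pawn-relative piece values and into rating-point equivalents. The coefficients in `COEF`, the Elo-style slope `RATING_COEF`, and all function names are hypothetical placeholders chosen for the example; they are not values or code from the paper.

```python
import math

# Hypothetical log-odds coefficients per unit of material advantage.
# These numbers are illustrative only, not estimates from the paper.
COEF = {"pawn": 0.25, "knight": 0.75, "bishop": 0.80, "rook": 1.25, "queen": 2.40}

# Assumed Elo-style slope for the rating-difference control variable.
RATING_COEF = math.log(10) / 400


def win_probability(material_diff, rating_diff=0.0):
    """Logistic win probability from material and rating differences."""
    logit = RATING_COEF * rating_diff
    logit += sum(COEF[piece] * diff for piece, diff in material_diff.items())
    return 1 / (1 + math.exp(-logit))


def pawn_relative_values():
    """Express each coefficient as a multiple of the pawn coefficient."""
    return {piece: coef / COEF["pawn"] for piece, coef in COEF.items()}


def rating_equivalent(piece):
    """Rating gap whose log-odds effect matches one extra piece."""
    return COEF[piece] / RATING_COEF


if __name__ == "__main__":
    print(pawn_relative_values())
    print(win_probability({"knight": 1}))
    print(rating_equivalent("knight"))
```

Under these assumptions, giving the weaker side an extra knight while the opponent holds a rating edge of `rating_equivalent("knight")` points brings the modeled win probability back to one half, which is the sense in which piece values can "equalize odds".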
We prove that chess played on the infinite chessboard $\mathbb{Z}^2$ with infinitely many pieces is as powerful as it could possibly be, by showing that every open Gale-Stewart game with draws is strategically equivalent to some infinite chess position and vice versa. As our construction is computable and open Gale-Stewart games are well understood, this allows us to resolve many open questions about the complexity of infinite chess with infinitely many pieces. In particular, all countable ordinals arise as the game value of some such chess position. We also give an alternate construction that realizes all countable ordinals as game values, with the pleasing property that it consists only of the king pair and pawns.
The chess domain is well-suited for creating an artificial intelligence (AI) system that mimics real-world challenges, including decision-making. Throughout the years, minimal attention has been paid to investigating insights derived from unstructured chess data sources. In this study, we examine the complicated relationships between multiple referenced moves in a chess-teaching textbook, and propose a novel method designed to encapsulate chess knowledge derived from move-action phrases. This study investigates the feasibility of using a modified sentiment analysis method as a means for evaluating chess moves based on text. Our proposed Aspect-Based Sentiment Analysis (ABSA) method represents an advancement in evaluating the sentiment associated with referenced chess moves. By extracting insights from move-action phrases, our approach aims to provide a more fine-grained and contextually aware 'chess move'-based sentiment classification. Through empirical experiments and analysis, we evaluate the performance of our fine-tuned ABSA model, presenting results that confirm the efficiency of our approach in advancing aspect-based sentiment classification within the chess domain. This res
Fog of War chess is a popular variant of classical chess, in which both players have only partial information about the position of the opponent's pieces. This study provides the first theoretical analysis of endgames in Fog of War chess. In particular, we analyze the setups king and queen versus king, king and rook versus king, and king and two rooks versus king. We show that a king and queen can always guarantee a win against a lone king. In contrast to classical chess, a king and a rook cannot guarantee a win against a lone king. However, adding one more rook guarantees a win.
This study addresses the challenge of quantifying chess puzzle difficulty - a complex task that combines elements of game theory and human cognition and underscores its critical role in effective chess training. We present GlickFormer, a novel transformer-based architecture that predicts chess puzzle difficulty by approximating the Glicko-2 rating system. Unlike conventional chess engines that optimize for game outcomes, GlickFormer models human perception of tactical patterns and problem-solving complexity. The proposed model utilizes a modified ChessFormer backbone for spatial feature extraction and incorporates temporal information via factorized transformer techniques. This approach enables the capture of both spatial chess piece arrangements and move sequences, effectively modeling spatio-temporal relationships relevant to difficulty assessment. Experimental evaluation was conducted on a dataset of over 4 million chess puzzles. Results demonstrate GlickFormer's superior performance compared to the state-of-the-art ChessFormer baseline across multiple metrics. The algorithm's performance has also been recognized through its competitive results in the IEEE BigData 2024 Cup: Pred
Chess recognition is the task of extracting the chess piece configuration from a chessboard image. Current approaches use a pipeline of separate, independent, modules such as chessboard detection, square localization, and piece classification. Instead, we follow the deep learning philosophy and explore an end-to-end approach to directly predict the configuration from the image, thus avoiding the error accumulation of the sequential approaches and eliminating the need for intermediate annotations. Furthermore, we introduce a new dataset, Chess Recognition Dataset (ChessReD), that consists of 10,800 real photographs and their corresponding annotations. In contrast to existing datasets that are synthetically rendered and have only limited angles, ChessReD has photographs captured from various angles using smartphone cameras; a sensor choice made to ensure real-world applicability. Our approach in chess recognition on the introduced challenging benchmark dataset outperforms related approaches, successfully recognizing the chess pieces' configuration in 15.26% of ChessReD's test images. This accuracy may seem low, but it is ~7x better than the current state-of-the-art and reflects the d
Contemporary chess engines offer precise yet opaque evaluations, typically expressed as centipawn scores. While effective for decision-making, these outputs obscure the underlying contributions of individual pieces or patterns. In this paper, we explore adapting SHAP (SHapley Additive exPlanations) to the domain of chess analysis, aiming to attribute a chess engine's evaluation to specific pieces on the board. By treating pieces as features and systematically ablating them, we compute additive, per-piece contributions that explain the engine's output in a locally faithful and human-interpretable manner. This method draws inspiration from classical chess pedagogy, where players assess positions by mentally removing pieces, and grounds it in modern explainable AI techniques. Our approach opens new possibilities for visualization, human training, and engine comparison. We release accompanying code and data to foster future research in interpretable chess AI.
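The idea of treating pieces as features and ablating them can be sketched with exact Shapley values over a toy evaluation function. This is a minimal illustration, not the paper's released code: `toy_evaluate`, the centipawn values in `PIECE_VALUE`, the bishop-pair bonus, and the piece labels are all assumptions made for the example, and a real setup would call a chess engine instead.

```python
from itertools import combinations
from math import factorial

# Hypothetical centipawn values and bonus, chosen only for illustration.
PIECE_VALUE = {"Q": 900, "R": 500, "B": 330, "N": 320, "P": 100}
BISHOP_PAIR_BONUS = 50


def toy_evaluate(pieces):
    """Stand-in for an engine: material plus a bishop-pair bonus."""
    score = sum(PIECE_VALUE[name[0]] for name in pieces)
    if sum(1 for name in pieces if name[0] == "B") >= 2:
        score += BISHOP_PAIR_BONUS
    return score


def shapley_values(pieces, evaluate):
    """Exact Shapley value of each piece, by ablating every subset of the others."""
    n = len(pieces)
    values = {}
    for piece in pieces:
        others = [p for p in pieces if p != piece]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                kept = set(subset)
                total += weight * (evaluate(kept | {piece}) - evaluate(kept))
        values[piece] = total
    return values


if __name__ == "__main__":
    position = ["Bc1", "Bf1", "Nb1"]
    print(shapley_values(position, toy_evaluate))
```

In this toy position the bishop-pair bonus is split evenly between the two bishops, and the contributions sum to the full evaluation, which is the additivity property that per-piece attributions rely on. Exact enumeration grows exponentially with the number of pieces, so a practical version would need sampling or another approximation.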
We introduce LLM CHESS, an evaluation framework designed to probe the generalization of reasoning and instruction-following abilities in large language models (LLMs) through extended agentic interaction in the domain of chess. We rank over 50 open and closed source models by playing against a random opponent using a range of behavioral metrics, including win and loss rates, move quality, move legality, hallucinated actions, and game duration. For a subset of top reasoning models, we derive an Elo estimate by playing against a chess engine with variably configured skill, which allows for comparisons between models in an easily understandable way. Despite the simplicity of the instruction-following task and the weakness of the opponent, many state-of-the-art models struggle to complete games or achieve consistent wins. Similar to other benchmarks on complex reasoning tasks, our experiments reveal a clear separation between reasoning and non-reasoning models. However, unlike existing static benchmarks, the stochastic and dynamic nature of LLM CHESS uniquely reduces overfitting and memorization while preventing benchmark saturation, proving difficult even for top reasoning models. To s
Chess has long been a testbed for AI's quest to match human intelligence, and in recent years, chess AI systems have surpassed the strongest humans at the game. However, these systems are not human-aligned; they are unable to match the skill levels of all human partners or model human-like behaviors beyond piece movement. In this paper, we introduce Allie, a chess-playing AI designed to bridge the gap between artificial and human intelligence in this classic game. Allie is trained on log sequences of real chess games to model the behaviors of human chess players across the skill spectrum, including non-move behaviors such as pondering times and resignations. In offline evaluations, we find that Allie exhibits humanlike behavior: it outperforms the existing state-of-the-art in human chess move prediction and "ponders" at critical positions. The model learns to reliably assign reward at each game state, which can be used at inference as a reward function in a novel time-adaptive Monte-Carlo tree search (MCTS) procedure, where the amount of search depends on how long humans would think in the same positions. Adaptive search enables remarkable skill calibration; in a large-scale online
Chess engines passed human strength years ago, but they still don't play like humans. A grandmaster under clock pressure blunders in ways a club player on a hot streak never would. Conventional engines capture none of this. This paper proposes a personality x psyche decomposition to produce behavioral variability in chess play, drawing on patterns observed in human games. Personality is static -- a preset that pins down the engine's character. Psyche is dynamic -- a bounded scalar ψ_t \in [-100, +100], recomputed from five positional factors after every move. These two components feed into an audio-inspired signal chain (noise gate, compressor/expander, five-band equalizer, saturation limiter) that reshapes move probability distributions on the fly. The chain doesn't care what engine sits behind it: any system that outputs move probabilities will do. It needs no search and carries no state beyond ψ_t. I test the framework across 12,414 games against Maia2-1100, feeding it two probability sources that differ by ~2,800x in training data. Both show the same monotonic gradient in top-move agreement (~20-25 pp spread from stress to overconfidence), which tells us the behavioral variatio
AI research in chess has been primarily focused on producing stronger agents that can maximize the probability of winning. However, there is another aspect to chess that has largely gone unexamined: its aesthetic appeal. Specifically, there exists a category of chess moves called "brilliant" moves. These moves are appreciated and admired by players for their high intellectual aesthetics. We demonstrate the first system for classifying chess moves as brilliant. The system uses a neural network, using the output of a chess engine as well as features that describe the shape of the game tree. The system achieves an accuracy of 79% (with 50% base-rate), a PPV of 83%, and an NPV of 75%. We demonstrate that what humans perceive as "brilliant" moves is not merely the best possible move. We show that a move is more likely to be predicted as brilliant, all things being equal, if a weaker engine considers it lower-quality (for the same rating by a stronger engine). Our system opens the avenues for computer chess engines to (appear to) display human-like brilliance, and, hence, creativity.
Reasoning is a central capability of human intelligence. In recent years, with the advent of large-scale datasets, pretrained large language models have emerged with new capabilities, including reasoning. However, these models still struggle with long-term, complex reasoning tasks, such as playing chess. Based on the observation that expert chess players employ a dual approach combining long-term strategic play with short-term tactical play along with language explanation, we propose improving the reasoning capability of large language models in chess by integrating annotated strategy and tactic. Specifically, we collect a dataset named MATE, which consists of 1 million chess positions with candidate moves annotated by chess experts for strategy and tactics. We finetune the LLaMA-3-8B model and compare it against state-of-the-art commercial language models in the task of selecting better chess moves. Our experiments show that our models perform better than GPT, Claude, and Gemini models. We find that language explanations can enhance the reasoning capability of large language models.
Cheating in chess, by using advice from powerful software, has become a major problem, reaching the highest levels. As opposed to the large majority of previous work, which concerned detection of cheating, here we try to evaluate the possible gain in performance, obtained by cheating a limited number of times during a game. We develop threshold-based and Bellman-style intervention policies, and test them in a controlled engine-vs-engine setting using Stockfish. A judicious choice of 1 or 2 cheats yields average scores of 0.71 and 0.82, respectively, compared to 0.51 with no cheats. We also introduce a fast, engine-free simulator that enables hyperparameter optimization without running games, closely matching the engine-based optimum. The goal of this work is not to assist cheaters, but to measure the effectiveness of cheating -- which is crucial as part of the effort to contain and detect it.
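One way to put average scores such as 0.51, 0.71, and 0.82 on a familiar scale is to invert the standard Elo expected-score formula, E = 1 / (1 + 10^(-d/400)), to obtain the rating gap d that would produce a given score. The sketch below does only that conversion; the function name `elo_gap` is made up for the example, and reading the reported scores this way is an interpretation added here, not an analysis from the paper.

```python
import math


def elo_gap(score):
    """Rating difference implied by an expected score under the Elo formula."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400 * math.log10(score / (1 - score))


if __name__ == "__main__":
    # Average scores quoted in the abstract: no cheats, one cheat, two cheats.
    for score in (0.51, 0.71, 0.82):
        print(score, round(elo_gap(score), 1))
```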
Analyzing chess positions is important for improving one's game. It requires entering moves into a chess engine, which is cumbersome and error-prone. We present ARChessAnalyzer, a complete pipeline from live image capture of a physical chess game, to board and piece recognition, to move analysis, and finally to an Augmented Reality (AR) overlay of the chess diagram position and move on the physical board. ARChessAnalyzer is like a scene analyzer: it uses an ensemble of traditional image and vision techniques to segment the scene (i.e., the chess game) and Convolutional Neural Networks (CNNs) to classify the segmented pieces, then combines the results to analyze the game. This paper advances the state of the art with the first end-to-end integration of robust board detection and segmentation, chess piece detection using a fine-tuned AlexNet CNN, and a chess engine analyzer in a handheld device app. The accuracy of the entire chess position prediction pipeline is 93.45%, and it takes 3-4.5 sec from live capture to AR overlay. We also validated our hypothesis that ARChessAnalyzer is faster at analysis than manual entry for all board positions for valid outcomes. Our hope is tha
Learning chess strategies has been investigated widely, with most studies focusing on learning from previous games using search algorithms. Chess textbooks encapsulate grandmaster knowledge, explain playing strategies and require a smaller search space compared to traditional chess agents. This paper examines chess textbooks as a new knowledge source for enabling machines to learn how to play chess -- a resource that has not been explored previously. We developed the LEAP corpus, the first heterogeneous dataset with structured (chess move notations and board states) and unstructured data (textual descriptions) collected from a chess textbook containing 1164 sentences discussing strategic moves from 91 games. We first labelled the sentences based on their relevance, i.e., whether they are discussing a move. Each relevant sentence was then labelled according to its sentiment towards the described move. We performed empirical experiments that assess the performance of various transformer-based baseline models for sentiment analysis. Our results demonstrate the feasibility of employing transformer-based sentiment analysis models for evaluating chess moves, with the best perfor
Berlekamp proposed a class of impartial combinatorial games based on the moves of chess pieces on rectangular boards. We generalize impartial chess games by playing them on Young diagrams and obtain results about winning and losing positions and Sprague-Grundy values for all chess pieces. We classify these games, and their restrictions to sets of partitions known as rectangles, staircases, and general staircases, according to the approach of Conway, later extended by Gurvich and Ho. The games $\rm {R\small OOK}$ and $\rm{Q\small UEEN}$ restricted to rectangles are known to have the same game tree as $2$-pile $\rm N{\small IM}$ and $\rm W{\small YTHOFF}$, respectively, so our work generalizes these well-known games.
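The stated correspondence between Rook and 2-pile Nim, and between Queen and Wythoff, on rectangles can be checked numerically with a short Sprague-Grundy computation. The sketch below assumes a piece at (r, c) may only move toward the corner (0, 0), decreasing a coordinate along its usual lines; that move convention and the function names are assumptions for illustration, and the Young-diagram generalization from the abstract is not modeled.

```python
from functools import lru_cache


def mex(values):
    """Minimum excludant: smallest non-negative integer not in values."""
    g = 0
    while g in values:
        g += 1
    return g


@lru_cache(maxsize=None)
def grundy(r, c, piece="rook"):
    """Sprague-Grundy value of a single piece at (r, c) moving toward (0, 0)."""
    options = set()
    for k in range(1, r + 1):          # move along the column
        options.add(grundy(r - k, c, piece))
    for k in range(1, c + 1):          # move along the row
        options.add(grundy(r, c - k, piece))
    if piece == "queen":
        for k in range(1, min(r, c) + 1):  # move along the diagonal
            options.add(grundy(r - k, c - k, piece))
    return mex(options)


if __name__ == "__main__":
    # Rook values coincide with the Nim-sum (bitwise XOR) of the coordinates.
    print(all(grundy(r, c) == r ^ c for r in range(8) for c in range(8)))
    # Queen positions with value 0 are the losing positions of Wythoff's game.
    print([(r, c) for r in range(11) for c in range(r, 11)
           if grundy(r, c, "queen") == 0])
```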
Chess graphs encode the moves that a particular chess piece can make on an $m\times n$ chessboard. We study these graphs through the lens of chip-firing games and graph gonality. We provide upper and lower bounds for the gonality of king's, bishop's, and knight's graphs, as well as for the toroidal versions of these graphs. We also prove that among all chess graphs, there exists an upper bound on gonality solely in terms of $\min\{m,n\}$, except for queen's, toroidal queen's, rook's, and toroidal bishop's graphs.
The game of chess has always been viewed as an iconic representation of intellectual prowess. Since the very beginning of computer science, the challenge of programming a computer capable of playing chess and beating humans has been alive, serving both as a benchmark for hardware and software progress and as an ongoing programming challenge leading to numerous discoveries. In the early days of computer science it was a topic for specialists. But as computers were democratized and the strength of chess engines began to increase, chess players started to adopt these new tools. We show how these interactions between the world of chess and information technologies have heralded the broader social impacts of information technologies. The game of chess, and more broadly the world of chess (chess players, literature, computer software and websites dedicated to chess, etc.), turns out to be a surprisingly sharp indicator of the changes induced in our everyday life by information technologies. Moreover, in the same way that chess is a model of war that captures the raw features of strategic thinking, the chess world can be seen as sma
We present SentiMATE, a novel end-to-end Deep Learning model for Chess, employing Natural Language Processing that aims to learn an effective evaluation function assessing move quality. This function is pre-trained on the sentiment of commentary associated with the training moves and is used to guide and optimize the agent's game-playing decision making. The contributions of this research are three-fold: we build and put forward both a classifier which extracts commentary describing the quality of Chess moves in vast commentary datasets, and a Sentiment Analysis model trained on Chess commentary to accurately predict the quality of said moves, to then use those predictions to evaluate the optimal next move of a Chess agent. Both classifiers achieve over 90 % classification accuracy. Lastly, we present a Chess engine, SentiMATE, which evaluates Chess moves based on a pre-trained sentiment evaluation function. Our results exhibit strong evidence to support our initial hypothesis - "Can Natural Language Processing be used to train a novel and sample efficient evaluation function in Chess Engines?" - as we integrate our evaluation function into modern Chess engines and play against age
Hundreds of years after its creation, the game of chess is still widely played worldwide. Opening Theory is one of the pillars of chess and requires years of study to be mastered. Here we exploit the "wisdom of the crowd" in an online chess platform to answer questions that, traditionally, only chess experts could tackle. We first define the relatedness network of chess openings that quantifies how similar two openings are to play. In this network, we spot communities of nodes corresponding to the most common opening choices and their mutual relationships, information which is hard to obtain from the existing classification of openings. Moreover, we use the relatedness network to forecast the future openings players will start to play and we back-test these predictions, obtaining performances considerably higher than those of a random predictor. Finally, we use the Economic Fitness and Complexity algorithm to measure how difficult to play openings are and how skilled in openings players are. This study not only gives a new perspective on chess analysis but also opens the possibility of suggesting personalized opening recommendations using complex network theory.