Found 20 results in total
Between April 1 and May 15, 2026, a group of 49 mathematicians compiled a dataset of research-level mathematics questions with known answers. Most of the work was done during the 3-day workshop *Benchmarks in Leipzig* with 35 participants at the Max Planck Institute for Mathematics in the Sciences in Leipzig, Germany. We present the resulting collection of 100 questions. We evaluated these questions in three stages: a single attempt by five state-of-the-art LLMs, followed by a 20-runs-per-model evaluation with three of these models, and finally a 3-run attempt with two heavy-thinking models. After Stage 1, 41 questions remained completely unsolved; after Stage 2, this count dropped to 16; and we concluded Stage 3 with only 2 unsolved questions. This demonstrates that the mathematical reasoning capabilities of LLMs are becoming impressive.
This article introduces the interactive Leipzig Corpus Miner (iLCM), a newly released, open-source software for automatic content analysis. Since the iLCM is based on the R programming language, its generic text mining procedures, provided via a user-friendly graphical user interface (GUI), can easily be extended using the integrated IDE RStudio-Server or numerous other interfaces in the tool. Furthermore, the iLCM offers various possibilities to use quantitative and qualitative research approaches in combination. Some of these possibilities will be presented in more detail in the following.
This paper presents the "Leipzig Corpus Miner", a technical infrastructure for supporting qualitative and quantitative content analysis. The infrastructure aims at the integration of 'close reading' procedures on individual documents with procedures of 'distant reading', e.g., lexical characteristics of large document collections. To this end, information retrieval systems, lexicometric statistics and machine learning procedures are combined in a coherent framework that enables qualitative data analysts to make use of state-of-the-art Natural Language Processing techniques on very large document collections. Applicability of the framework ranges from social sciences to media studies and market research. As an example, we introduce the usage of the framework in a political science study on post-democracy and neoliberalism.
Examining the effectiveness of institutional scientific coalitions can inform future policies. This study examines the structure of scientific collaborations in three cities in central Germany. Since 1995, the three universities of this region have formed and maintained a coalition which led to the establishment of an interdisciplinary center in 2012, the German Center for Integrative Biodiversity Research (iDiv). We investigate whether the impact of the former coalition is evident in the region's structure of scientific collaborations and the scientific output of the new center. Using publication data from 1996-2018, we build co-authorship networks and identify the most cohesive communities in terms of collaboration, and compare them with communities identified based on publications presented as the scientific outcome of the coalition and new center on their website. Our results show that despite the highly cohesive structure of collaborations presented on the coalition website, there is still much potential to be realized. The newly established center has bridged the member institutions, but not to a particularly strong degree. We see that geographical proximity, collaboration polici
Road-traffic NO2 hotspots are still often modelled with static emissions and generic temporal profiles, although near-road concentrations respond strongly to rapidly changing traffic conditions. Here, we test whether detector-informed dynamic traffic emissions improve hyperlocal NO2 modelling relative to a conventional static baseline. To this end, we couple an online-calibrated mesoscopic traffic model (SUMO) with the LES-based urban dispersion model CAIRDIO in a nested high-resolution framework for Leipzig, Germany. We compare two otherwise identical experiment setups: a static reference simulation and a coupled simulation in which road-traffic emissions within the SUMO domain are replaced by dynamic emissions derived from simulated traffic states. The framework is designed for city-wide high-resolution application, while the present evaluation focuses on two traffic-oriented hotspot settings during two one-week periods. Compared against hourly NO2 observations of official air quality monitoring, the coupled setup performs better overall, with the clearest improvement at the street-canyon hotspot and in the representation of concentration peaks. Dynamic traffic emissions therefor
Arrhythmias are a major cause of sudden cardiac death in children, making automated rhythm classification from electrocardiograms (ECGs) clinically important. However, pediatric arrhythmia analysis remains challenging because of age-dependent waveform variability, limited data availability, and a pronounced long-tailed class distribution that hinders recognition of rare but clinically important rhythms. To address these issues, we propose a multimodal end-to-end framework that integrates surface ECG and intracardiac electrogram (IEGM) signals for pediatric arrhythmia classification. The model combines dual-branch feature encoders, attention-based cross-modal fusion, and a lightweight Transformer classifier to learn complementary electrophysiological representations. We further introduce an Adaptive Global Class-Aware Contrastive Loss (AGCACL), which incorporates prototype-based alignment, class-frequency reweighting, and globally informed hard-class modulation to improve intra-class compactness and inter-class separability under class imbalance. We evaluate the proposed method on the pediatric subset of the Leipzig Heart Center ECG-Database and establish a reproducible preprocessin
Let $G$ be a simple finite connected graph. The line graph $L(G)$ of a graph $G$ is the graph whose vertices are the edges of $G$, where $ef \in E(L(G))$ when $e \cap f \neq \emptyset$. The higher order line graphs are defined inductively as $L^1(G) = L(G)$ and $L^n(G) = L(L^{n-1}(G))$ for $n \geq 2$. In [Derived graphs and digraphs, Beiträge zur Graphentheorie (Teubner, Leipzig 1968), 17--33 (1968)], Beineke characterized line graphs in terms of nine forbidden subgraphs. Inspired by this result, in this paper, we characterize second order line graphs in terms of pure forbidden induced subgraphs. We also give a sufficient list of forbidden subgraphs for a graph $G$ such that $G$ is a higher order line graph. We characterize line graphs of all orders of a graph $G$ with $\Delta(G) = 3$ and $4$.
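The definition of $L(G)$ and $L^n(G)$ in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function names `line_graph` and `iterated_line_graph` and the edge-list representation are assumptions made for illustration, and the input is assumed to be a simple graph without repeated edges.

```python
from itertools import combinations


def line_graph(edges):
    """Return the edge list of L(G) for a simple graph G.

    `edges` is a list of 2-tuples of hashable vertex labels.
    Vertex i of L(G) stands for edges[i]; two vertices of L(G)
    are adjacent when the corresponding edges of G share an endpoint.
    """
    edge_sets = [frozenset(e) for e in edges]
    return [
        (i, j)
        for i, j in combinations(range(len(edge_sets)), 2)
        if edge_sets[i] & edge_sets[j]  # non-empty intersection
    ]


def iterated_line_graph(edges, n):
    """Return the edge list of L^n(G), applying L repeatedly (n >= 1)."""
    for _ in range(n):
        edges = line_graph(edges)
    return edges


# Hypothetical example graphs, chosen only for illustration.
path_p4 = [(0, 1), (1, 2), (2, 3)]   # path on four vertices
star_k13 = [(0, 1), (0, 2), (0, 3)]  # claw K_{1,3}

p4_line = line_graph(path_p4)                # a path on three vertices
claw_line = line_graph(star_k13)             # a triangle
claw_second = iterated_line_graph(star_k13, 2)  # again a triangle
```

The quadratic pairwise check keeps the sketch short. For large graphs, grouping edges by endpoint would avoid comparing edges that cannot meet.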
These are lecture notes for five lectures given at MPI Leipzig in May 2024. We study the moduli space M_{0,n} of n distinct points on P^1 as a positive geometry and a binary geometry. We develop mathematical formalism to study Cachazo-He-Yuan's scattering equations and the associated scalar and Yang-Mills amplitudes. We discuss open superstring amplitudes and relations to tropical geometry.
Deep learning algorithms require extensive data to achieve robust performance. However, data availability is often restricted in the medical domain due to patient privacy concerns. Synthetic data presents a possible solution to these challenges. Image generative models have found increasing use for medical applications, but are often task-specific, thus limiting their scalability. Moreover, existing models frequently rely on private datasets for training, which constrains their reproducibility. To address this, we introduce MediSyn: an open-access, generalist, text-guided latent diffusion model capable of generating synthetic images across 6 medical specialties and 10 imaging modalities, while being trained exclusively on publicly available data. Through extensive experimentation, we provide several key contributions. First, we demonstrate that training a generative model on visually diverse medical images does not degrade synthetic image quality. Second, we show that this generalist approach is substantially more computationally efficient than a coordinated suite of task-specific models. Third, we establish that a generalist model can produce realistic, text-aligned synthetic image
Understanding the behavior of non-human primates is crucial for improving animal welfare, modeling social behavior, and gaining insights into distinctively human and phylogenetically shared behaviors. However, the lack of datasets on non-human primate behavior hinders in-depth exploration of primate social interactions, posing challenges to research on our closest living relatives. To address these limitations, we present ChimpACT, a comprehensive dataset for quantifying the longitudinal behavior and social relations of chimpanzees within a social group. Spanning from 2015 to 2018, ChimpACT features videos of a group of over 20 chimpanzees residing at the Leipzig Zoo, Germany, with a particular focus on documenting the developmental trajectory of one young male, Azibo. ChimpACT is both comprehensive and challenging, consisting of 163 videos with a cumulative 160,500 frames, each richly annotated with detection, identification, pose estimation, and fine-grained spatiotemporal behavior labels. We benchmark representative methods of three tracks on ChimpACT: (i) tracking and identification, (ii) pose estimation, and (iii) spatiotemporal action detection of the chimpanzees. Our experim
These are lecture notes from a course taken in Leipzig in the spring semester of 2023. The course deals with extremal combinatorics, algebraic methods and combinatorial geometry. The notes are not meant to be exhaustive, and do not contain many of the proofs that were presented in the course.
For many years, various experiments have attempted to shed light on the nature of dark matter (DM). This work investigates the possibility of using CaWO4 crystals for the direct search of spin-dependent DM interactions using the isotope 17O with a nuclear spin of 5/2. Due to the low natural abundance of 0.038%, an enrichment of the CaWO4 crystals with 17O is developed during the crystal production process at the Technical University of Munich. Three CaWO4 crystals were enriched, and their 17O content was measured by nuclear magnetic resonance spectroscopy at the University of Leipzig. This paper presents the concept and first results of the 17O enrichment and discusses the possibility of using enriched crystals to increase the sensitivity for the spin-dependent DM search with CRESST.
Cardiac MRI allows for a comprehensive assessment of myocardial structure, function and tissue characteristics. Here we describe a foundational vision system for cardiac MRI, capable of representing the breadth of human cardiovascular disease and health. Our deep-learning model is trained via self-supervised contrastive learning, in which visual concepts in cine-sequence cardiac MRI scans are learned from the raw text of the accompanying radiology reports. We train and evaluate our model on data from four large academic clinical institutions in the United States. We additionally showcase the performance of our models on the UK BioBank and two additional publicly available external datasets. We explore emergent capabilities of our system and demonstrate remarkable performance across a range of tasks, including the problem of left-ventricular ejection fraction regression and the diagnosis of 39 different conditions such as cardiac amyloidosis and hypertrophic cardiomyopathy. We show that our deep-learning system is capable of not only contextualizing the staggering complexity of human cardiovascular disease but can be directed towards clinical problems of interest, yielding impressiv
These notes are based on a series of lectures given by the author at the Max Planck Institute for Mathematics in the Sciences in Leipzig. Addressed topics include affine and projective toric varieties, abstract normal toric varieties from fans, divisors on toric varieties and Cox's construction of a toric variety as a GIT quotient. We emphasize the role of toric varieties in solving systems of polynomial equations and provide many computational examples using the Julia package Oscar.jl.
We summarise the results of the RoboCup 2D Soccer Simulation League in 2016 (Leipzig), including the main competition and the evaluation round. The evaluation round held in Leipzig confirmed the strength of the RoboCup-2015 champion (WrightEagle, i.e. WE2015) in the League, with only the eventual finalists of the 2016 competition capable of defeating WE2015. An extended, post-Leipzig, round-robin tournament which included the top 8 teams of 2016, as well as WE2015, with over 1000 games played for each pair, placed WE2015 third behind the champion team (Gliders2016) and the runner-up (HELIOS2016). This establishes WE2015 as a stable benchmark for the 2D Simulation League. We then contrast two ranking methods and suggest two options for future evaluation challenges. The first one, "The Champions Simulation League", is proposed to include 6 previous champions, directly competing against each other in a round-robin tournament, with a view to systematically tracing the advancements in the League. The second proposal, "The Global Challenge", aims to increase the realism of the environmental conditions during the simulated games, by simulating specific features of different participating countries.
With much of our lives taking place online, researchers are increasingly turning to information from the World Wide Web to gain insights into geographic patterns and processes. Web scraping as an online data acquisition technique allows us to gather intelligence especially on social and economic actions for which the Web serves as a platform. Specific opportunities relate to near-real-time access to object-level geolocated data, which can be captured in a cost-effective way. The studied geographic phenomena include, but are not limited to, the rental market and associated processes such as gentrification, entrepreneurial ecosystems, or spatial planning processes. Since the information retrieved from the Web is not made available for that purpose, Web scraping faces several unique challenges, several of which relate to location. Ethical and legal issues mainly relate to intellectual property rights, informed consent and (geo-) privacy, and website integrity and contract. These issues also affect the practice of open science. In addition, there are technical and statistical challenges that relate to dependability and incompleteness, data inconsistencies and bias, as well as the limit
We reverse-engineer a formal semantics of the Component Definition Language (CDL), which is part of the highly configurable, embedded operating system eCos. This work provides the basis for an analysis and comparison of the two variability-modeling languages Kconfig and CDL. The semantics given in this document are based on analyzing the CDL documentation, inspecting the source code of the toolchain, as well as testing the tools on particular examples.
The Taylor formula historically also pertains to Johann (or Ivan) Bernoulli. Bernoulli's Series universalissima appeared in Acta Eruditorum in Leipzig when Brook Taylor was nine years old. As today marks the two hundred fifty-eighth anniversary of the death of Johann Bernoulli, this article is also offered pro memoriam. Very recent extensions of this celebrated formula are indicated in the framework of extended umbral calculus.
This work is an extensive literature review focusing on a few of the important topics in the large-scale structure of spacetime. The work is a Bachelor's thesis submitted at the Institute of Theoretical Physics, University of Leipzig. The author starts the thesis by deriving the Schwarzschild metric and the Reissner-Nordström metric and discusses their properties with appropriate Penrose diagrams. The Kerr black hole metric is assumed, and its features are examined. The causal structure of spacetime is discussed with a focus on different stages of causality on a spacetime, Cauchy surfaces, and what it means for a spacetime to be globally hyperbolic. Then, the author moves to singularity theorems, where one can find an algorithm to deal with non-coordinate singularities in spacetime. A few singularity theorems are discussed in detail. Then, the asymptotic structure of spacetime is discussed, which leads to a precise definition of a black hole and its event horizon. The black hole area theorem by Prof. Hawking is dissected pointwise with exclusive author comments as an attempt to simplify it for a less advanced audience. After that, a few exotic propositions in general relativity like c
The iLCM project pursues the development of an integrated research environment for the analysis of structured and unstructured data in a "Software as a Service" architecture (SaaS). The research environment addresses requirements for the quantitative evaluation of large amounts of qualitative data with text mining methods as well as requirements for the reproducibility of data-driven research designs in the social sciences. For this, the iLCM research environment comprises two central components. The first is the Leipzig Corpus Miner (LCM), a decentralized SaaS application for the analysis of large amounts of news texts developed in a previous Digital Humanities project. Second, the text mining tools implemented in the LCM are extended by an "Open Research Computing" (ORC) environment for executable script documents, so-called "notebooks". This novel integration makes it possible to combine generic, high-performance methods for processing large amounts of unstructured text data with individual program scripts that address specific research requirements in computational social science and digital humanities.