Found 20 results
Between April 1 and May 15, 2026, a group of 49 mathematicians compiled a dataset of research-level mathematics questions with known answers. Most of the work was done during the 3-day workshop *Benchmarks in Leipzig* with 35 participants at the Max Planck Institute for Mathematics in the Sciences in Leipzig, Germany. We present the resulting collection of 100 questions. We evaluated these questions in three stages: a single attempt by five state-of-the-art LLMs, followed by a 20-runs-per-model evaluation with three of these models, and finally a 3-run attempt with two heavy-thinking models. After Stage 1, 41 questions remained completely unsolved; after Stage 2, this count dropped to 16; and we concluded Stage 3 with only 2 unsolved questions. This demonstrates that the mathematical reasoning capabilities of LLMs are becoming impressive.
This article introduces the interactive Leipzig Corpus Miner (iLCM), a newly released, open-source software tool for automatic content analysis. Since the iLCM is based on the R programming language, its generic text mining procedures, provided via a user-friendly graphical user interface (GUI), can easily be extended using the integrated IDE RStudio-Server or numerous other interfaces in the tool. Furthermore, the iLCM offers various possibilities for combining quantitative and qualitative research approaches. Some of these possibilities are presented in more detail below.
This paper presents the "Leipzig Corpus Miner", a technical infrastructure for supporting qualitative and quantitative content analysis. The infrastructure aims at the integration of 'close reading' procedures on individual documents with procedures of 'distant reading', e.g. lexical characteristics of large document collections. Therefore information retrieval systems, lexicometric statistics and machine learning procedures are combined in a coherent framework which enables qualitative data analysts to make use of state-of-the-art Natural Language Processing techniques on very large document collections. Applicability of the framework ranges from social sciences to media studies and market research. As an example we introduce the usage of the framework in a political science study on post-democracy and neoliberalism.
Examining the effectiveness of institutional scientific coalitions can inform future policies. This study examines the structure of scientific collaborations in three cities in central Germany. Since 1995, the three universities of this region have formed and maintained a coalition, which led to the establishment of an interdisciplinary center in 2012, the German Center for Integrative Biodiversity Research (iDiv). We investigate whether the impact of the former coalition is evident in the region's structure of scientific collaborations and the scientific output of the new center. Using publications data from 1996-2018, we build co-authorship networks and identify the most cohesive communities in terms of collaboration, and compare them with communities identified based on publications presented as the scientific outcome of the coalition and new center on their website. Our results show that despite the highly cohesive structure of collaborations presented on the coalition website, there is still much potential to be realized. The newly established center has bridged the member institutions, but not particularly strongly. We see that geographical proximity, collaboration polici
Road-traffic NO2 hotspots are still often modelled with static emissions and generic temporal profiles, although near-road concentrations respond strongly to rapidly changing traffic conditions. Here, we test whether detector-informed dynamic traffic emissions improve hyperlocal NO2 modelling relative to a conventional static baseline. To this end, we couple an online-calibrated mesoscopic traffic model (SUMO) with the LES-based urban dispersion model CAIRDIO in a nested high-resolution framework for Leipzig, Germany. We compare two otherwise identical experiment setups: a static reference simulation and a coupled simulation in which road-traffic emissions within the SUMO domain are replaced by dynamic emissions derived from simulated traffic states. The framework is designed for city-wide high-resolution application, while the present evaluation focuses on two traffic-oriented hotspot settings during two one-week periods. Compared against hourly NO2 observations of official air quality monitoring, the coupled setup performs better overall, with the clearest improvement at the street-canyon hotspot and in the representation of concentration peaks. Dynamic traffic emissions therefor
Arrhythmias are a major cause of sudden cardiac death in children, making automated rhythm classification from electrocardiograms (ECGs) clinically important. However, pediatric arrhythmia analysis remains challenging because of age-dependent waveform variability, limited data availability, and a pronounced long-tailed class distribution that hinders recognition of rare but clinically important rhythms. To address these issues, we propose a multimodal end-to-end framework that integrates surface ECG and intracardiac electrogram (IEGM) signals for pediatric arrhythmia classification. The model combines dual-branch feature encoders, attention-based cross-modal fusion, and a lightweight Transformer classifier to learn complementary electrophysiological representations. We further introduce an Adaptive Global Class-Aware Contrastive Loss (AGCACL), which incorporates prototype-based alignment, class-frequency reweighting, and globally informed hard-class modulation to improve intra-class compactness and inter-class separability under class imbalance. We evaluate the proposed method on the pediatric subset of the Leipzig Heart Center ECG-Database and establish a reproducible preprocessin
These are lecture notes for five lectures given at MPI Leipzig in May 2024. We study the moduli space M_{0,n} of n distinct points on P^1 as a positive geometry and a binary geometry. We develop mathematical formalism to study Cachazo-He-Yuan's scattering equations and the associated scalar and Yang-Mills amplitudes. We discuss open superstring amplitudes and relations to tropical geometry.
Let $G$ be a simple finite connected graph. The line graph $L(G)$ of $G$ is the graph whose vertices are the edges of $G$, where $ef \in E(L(G))$ when $e \cap f \neq \emptyset$. The higher order line graphs are defined inductively as $L^1(G) = L(G)$ and $L^n(G) = L(L^{n-1}(G))$ for $n \geq 2$. In [Derived graphs and digraphs, Beiträge zur Graphentheorie (Teubner, Leipzig 1968), 17--33 (1968)], Beineke characterized line graphs in terms of nine forbidden subgraphs. Inspired by this result, in this paper we characterize second order line graphs in terms of pure forbidden induced subgraphs. We also give a sufficient list of forbidden subgraphs for a graph $G$ such that $G$ is a higher order line graph. We characterize all order line graphs of a graph $G$ with $\Delta(G) = 3$ and $4$.
Deep learning algorithms require extensive data to achieve robust performance. However, data availability is often restricted in the medical domain due to patient privacy concerns. Synthetic data presents a possible solution to these challenges. Image generative models have found increasing use for medical applications, but are often task-specific, thus limiting their scalability. Moreover, existing models frequently rely on private datasets for training, which constrains their reproducibility. To address this, we introduce MediSyn: an open-access, generalist, text-guided latent diffusion model capable of generating synthetic images across 6 medical specialties and 10 imaging modalities, while being trained exclusively on publicly available data. Through extensive experimentation, we provide several key contributions. First, we demonstrate that training a generative model on visually diverse medical images does not degrade synthetic image quality. Second, we show that this generalist approach is substantially more computationally efficient than a coordinated suite of task-specific models. Third, we establish that a generalist model can produce realistic, text-aligned synthetic image
For many years, various experiments have attempted to shed light on the nature of dark matter (DM). This work investigates the possibility of using CaWO4 crystals for the direct search of spin-dependent DM interactions using the isotope 17O with a nuclear spin of 5/2. Due to the low natural abundance of 0.038%, an enrichment of the CaWO4 crystals with 17O is developed during the crystal production process at the Technical University of Munich. Three CaWO4 crystals were enriched, and their 17O content was measured by nuclear magnetic resonance spectroscopy at the University of Leipzig. This paper presents the concept and first results of the 17O enrichment and discusses the possibility of using enriched crystals to increase the sensitivity for the spin-dependent DM search with CRESST.
These are lecture notes of a course taken in Leipzig in the spring semester of 2023. The course deals with extremal combinatorics, algebraic methods and combinatorial geometry. The notes are not meant to be exhaustive and omit many proofs that were presented in the course.
Cardiac MRI allows for a comprehensive assessment of myocardial structure, function and tissue characteristics. Here we describe a foundational vision system for cardiac MRI, capable of representing the breadth of human cardiovascular disease and health. Our deep-learning model is trained via self-supervised contrastive learning, in which visual concepts in cine-sequence cardiac MRI scans are learned from the raw text of the accompanying radiology reports. We train and evaluate our model on data from four large academic clinical institutions in the United States. We additionally showcase the performance of our models on the UK BioBank and two additional publicly available external datasets. We explore emergent capabilities of our system and demonstrate remarkable performance across a range of tasks, including the problem of left-ventricular ejection fraction regression and the diagnosis of 39 different conditions such as cardiac amyloidosis and hypertrophic cardiomyopathy. We show that our deep-learning system is capable of not only contextualizing the staggering complexity of human cardiovascular disease but can be directed towards clinical problems of interest, yielding impressiv
Understanding the behavior of non-human primates is crucial for improving animal welfare, modeling social behavior, and gaining insights into distinctively human and phylogenetically shared behaviors. However, the lack of datasets on non-human primate behavior hinders in-depth exploration of primate social interactions, posing challenges to research on our closest living relatives. To address these limitations, we present ChimpACT, a comprehensive dataset for quantifying the longitudinal behavior and social relations of chimpanzees within a social group. Spanning from 2015 to 2018, ChimpACT features videos of a group of over 20 chimpanzees residing at the Leipzig Zoo, Germany, with a particular focus on documenting the developmental trajectory of one young male, Azibo. ChimpACT is both comprehensive and challenging, consisting of 163 videos with a cumulative 160,500 frames, each richly annotated with detection, identification, pose estimation, and fine-grained spatiotemporal behavior labels. We benchmark representative methods of three tracks on ChimpACT: (i) tracking and identification, (ii) pose estimation, and (iii) spatiotemporal action detection of the chimpanzees. Our experim
With much of our lives taking place online, researchers are increasingly turning to information from the World Wide Web to gain insights into geographic patterns and processes. Web scraping as an online data acquisition technique allows us to gather intelligence especially on social and economic actions for which the Web serves as a platform. Specific opportunities relate to near-real-time access to object-level geolocated data, which can be captured in a cost-effective way. The studied geographic phenomena include, but are not limited to, the rental market and associated processes such as gentrification, entrepreneurial ecosystems, or spatial planning processes. Since the information retrieved from the Web is not made available for that purpose, Web scraping faces several unique challenges, several of which relate to location. Ethical and legal issues mainly relate to intellectual property rights, informed consent and (geo-) privacy, and website integrity and contract. These issues also affect the practice of open science. In addition, there are technical and statistical challenges that relate to dependability and incompleteness, data inconsistencies and bias, as well as the limit
We reverse-engineer a formal semantics of the Component Definition Language (CDL), which is part of the highly configurable, embedded operating system eCos. This work provides the basis for an analysis and comparison of the two variability-modeling languages Kconfig and CDL. The semantics given in this document are based on analyzing the CDL documentation, inspecting the source code of the toolchain, as well as testing the tools on particular examples.
We summarise the results of the RoboCup 2D Soccer Simulation League in 2016 (Leipzig), including the main competition and the evaluation round. The evaluation round held in Leipzig confirmed the strength of the RoboCup-2015 champion (WrightEagle, i.e. WE2015) in the League, with only the eventual finalists of the 2016 competition capable of defeating WE2015. An extended, post-Leipzig, round-robin tournament which included the top 8 teams of 2016, as well as WE2015, with over 1000 games played for each pair, placed WE2015 third behind the champion team (Gliders2016) and the runner-up (HELIOS2016). This establishes WE2015 as a stable benchmark for the 2D Simulation League. We then contrast two ranking methods and suggest two options for future evaluation challenges. The first one, "The Champions Simulation League", is proposed to include 6 previous champions, directly competing against each other in a round-robin tournament, with a view to systematically tracing the advancements in the League. The second proposal, "The Global Challenge", aims to increase the realism of the environmental conditions during the simulated games by simulating specific features of different participating countries.
In this paper we discuss various problems associated with temporal phenomena. These problems include persistence and change, the integration of objects and processes, and truth-makers for temporal propositions. We propose an approach which interprets persistence as a phenomenon emanating from the activity of the mind, and which additionally postulates that persistence ultimately rests on personal identity. The General Formal Ontology (GFO) is a top level ontology being developed at the University of Leipzig. Top level ontologies can be roughly divided into 3D-ontologies and 4D-ontologies. GFO is the only top level ontology used in applications that is a 4D-ontology additionally admitting 3D objects. Objects and processes are integrated in a natural way.
This is the author's Ph.D. thesis, submitted to the University of Leipzig. It deals with the $L^2$ Riemannian metric on the manifold of all smooth Riemannian metrics on a fixed closed, finite-dimensional manifold. The main body of the thesis is a description of the completion manifold of metrics with respect to the $L^2$ metric. The primary motivation for studying this problem comes from Teichmueller theory, where similar considerations lead to a completion of the well-known Weil-Petersson metric. We give an application of the main theorem to the completions of Teichmueller space with respect to a class of metrics that generalize the Weil-Petersson metric. We also prove that the $L^2$ metric induces a metric space structure on the manifold of metrics. As the $L^2$ metric is a weak Riemannian metric, this fact does not follow from general results. In addition, we prove several results on the exponential mapping and distance function of a weak Riemannian metric on a Hilbert/Frechet manifold. The statements are analogous to, but weaker than, what is known in the case of a Riemannian metric on a finite-dimensional manifold or a strong Riemannian metric on a Hilbert manifold.
The first use of the stretched exponential function to describe the time evolution of a non-equilibrium quantity is usually credited to Rudolph Kohlrausch (1809-1858), who in 1854 applied it to the discharge of a capacitor. Attention is drawn to a set of pioneering works on the Kohlrausch function, one of which was published 101 years ago, that are not mentioned by Cardona, Chamberlin, and Marx in Ann. Phys. (Leipzig) 16, 842 (2007).
Cyp (Check Your Proofs) (Durner and Noschinski 2013; Traytel 2019) verifies proofs about Haskell-like programs. We extended Cyp with a pattern matcher for programs and proof terms, and a type checker. This allows Cyp to be used for auto-grading exercises where the goal is to complete programs and proofs that are partially given by the instructor, as terms with holes. Since this allows holes in programs, type-checking becomes essential. Previously, Cyp assumed that the program was written by a type-correct instructor, and therefore omitted type-checking of proofs. Cyp gracefully handles incomplete student submissions: it accepts holes temporarily and checks complete subtrees fully. We present basic design decisions, make some remarks on implementation, and include example exercises from a recent course that used Cyp as part of the Leipzig Autotool auto-grading system.