共找到 20 条结果
The winner-takes-all (WTA) process takes place on an arbitrary graph. There is an agent on each vertex of the graph, and active agents at neighboring vertices play games. In each game, a randomly chosen agent wins, while the loser is eliminated from subsequent games. The games are played at random times; each game finishes instantaneously, and the games cease when each active agent has only losers among its neighbors. On the one-dimensional lattice, the fraction of winners in the final state is $e^{-1}$, and we also determine the fractions $w_j$ of winners who won $j=0, 1, 2$ games. For the WTA process on a segment, we determine statistics of the total number of winners (the average, the variance, and all higher cumulants), the probabilities of reaching the final state with the minimum or maximum number of winners, and establish the behavior near the boundaries. For infinite regular trees with vertices of degree $d$, i.e., Bethe lattices with coordination number $d$, the fraction of winners is $(2/d)^{d/(d-2)}$.
Although language models demonstrate remarkable proficiency on mathematical benchmarks, it remains unclear whether this reflects true mathematical reasoning or statistical pattern matching over learning formal syntax. Most existing evaluations rely on symbolic problems grounded in established mathematical conventions, limiting insight into the models' ability to construct abstract concepts from first principles. In this work, we propose Math Takes Two, a new benchmark designed to assess the emergence of mathematical reasoning through communication. Motivated by the hypothesis that mathematical cognition in humans co-evolved with the need for precise communication, our benchmark tests whether two agents, without prior mathematical knowledge, can develop a shared symbolic protocol to solve a visually grounded task where the use of a numerical system facilitates extrapolation. Unlike many current datasets, our benchmark eschews predefined mathematical language, instead requiring agents to discover latent structure and representations from scratch. Math Takes Two thus provides a novel lens through which to develop and evaluate models with emergent numerical reasoning capabilities.
We prove that the trace of the Hecke operator $T_2$ acting on the vector space of cusp forms of level one takes no repeated values, except for 0, which only occurs when the space is trivial.
We study competing first passage percolation on graphs generated by the configuration model with infinite-mean degrees. Initially, two uniformly chosen vertices are infected with type 1 and type 2 infection, respectively, and the infection then spreads via nearest neighbors in the graph. The time it takes for the type 1 (resp. 2) infection to traverse an edge $e$ is given by a random variable $X_1(e)$ (resp. $X_2(e)$) and, if the vertex at the other end of the edge is still uninfected, it then becomes type 1 (resp. 2) infected and immune to the other type. Assuming that the degrees follow a power-law distribution with exponent $τ\in (1,2)$, we show that, with high probability as the number of vertices tends to infinity, one of the infection types occupies all vertices except for the starting point of the other type. Moreover, both infections have a positive probability of winning regardless of the passage times distribution. The result is also shown to hold for the erased configuration model, where self-loops are erased and multiple edges are merged, and when the degrees are conditioned to be smaller than $n^α$ for some $α> 0$.
Nearly all jurisdictions in the United States require a professional license exam, commonly referred to as "the Bar Exam," as a precondition for law practice. To even sit for the exam, most jurisdictions require that an applicant completes at least seven years of post-secondary education, including three years at an accredited law school. In addition, most test-takers also undergo weeks to months of further, exam-specific preparation. Despite this significant investment of time and capital, approximately one in five test-takers still score under the rate required to pass the exam on their first try. In the face of a complex task that requires such depth of knowledge, what, then, should we expect of the state of the art in "AI?" In this research, we document our experimental evaluation of the performance of OpenAI's `text-davinci-003` model, often-referred to as GPT-3.5, on the multistate multiple choice (MBE) section of the exam. While we find no benefit in fine-tuning over GPT-3.5's zero-shot performance at the scale of our training data, we do find that hyperparameter optimization and prompt engineering positively impacted GPT-3.5's zero-shot performance. For best prompt and para
This paper focuses on density-based clustering, particularly the Density Peak (DP) algorithm and the one based on density-connectivity DBSCAN; and proposes a new method which takes advantage of the individual strengths of these two methods to yield a density-based hierarchical clustering algorithm. Our investigation begins with formally defining the types of clusters DP and DBSCAN are designed to detect; and then identifies the kinds of distributions that DP and DBSCAN individually fail to detect all clusters in a dataset. These identified weaknesses inspire us to formally define a new kind of clusters and propose a new method called DC-HDP to overcome these weaknesses to identify clusters with arbitrary shapes and varied densities. In addition, the new method produces a richer clustering result in terms of hierarchy or dendrogram for better cluster structures understanding. Our empirical evaluation results show that DC-HDP produces the best clustering results on 14 datasets in comparison with 7 state-of-the-art clustering algorithms.
I propose the index $\hbar$ ("hbar"), defined as the number of papers of an individual that have citation count larger than or equal to the $\hbar$ of all coauthors of each paper, as a useful index to characterize the scientific output of a researcher that takes into account the effect of multiple coauthorship. The bar is higher for $\hbar$.
Amid concerns about AI models’ cybersecurity capabilities, OpenAI revealed an improved version of GPT-5。5-Cyber and its “Patch the Planet” initiative to fix open-source software bugs
Accurate salary prediction is critical for bridging the information gap between employers and job seekers in modern labor markets. Existing approaches predominantly yield a single point estimate and treat job attributes such as location, occupation, and industry as independent categorical features, ignoring both the inherent uncertainty and multi-modality of real-world compensation data and the rich hierarchical and semantic-similarity relationships that govern pay norms. In this paper we propose GAT-MDN, a unified framework that addresses both limitations simultaneously. For each of the three attribute domains we construct a domain-specific graph whose edges encode (i) hierarchical parent-child containment and (ii) weighted similarity links derived from a pre-trained Sentence-Transformer. Parallel Graph Attention Networks (GATs) with edge-feature-aware attention learn rich, context-sensitive node representations from these multi-relational graphs. A priority-based hierarchical selection module then assembles a composite feature vector that gracefully handles missing or coarse attributes, and a Mixture Density Network (MDN) head maps this vector to the parameters of a Gaussian Mixt
Community detection is a critical tool for understanding the mesoscopic structure of large-scale networks. However, when applied to aggregated or coarse-grained social networks, disjoint community partitions cannot capture the diverse composition of community memberships within aggregated nodes. While existing mixed membership methods alleviate this issue, they may detect communities that are highly sensitive to the aggregation resolution, not reliably reflecting the community structure of the underlying individual-level network. This paper presents the Link Fraction Mixed Membership (LFMM) method, which computes the mixed memberships of nodes in aggregated networks. Unlike existing mixed membership methods, LFMM is consistent under aggregation. Specifically, we show that it conserves community membership sums at different scales. The method is utilized to study a population-scale social network of the Netherlands, aggregated at different resolutions. Experiments reveal variation in community membership across different geographical regions and evolution over the last decade. In particular, we show how our method identifies large urban hubs that act as the melting pots of diverse,
Population-level dynamics of social cohesion and its underlying mechanisms remain difficult to study. In this paper, we propose a network approach to measure the evolution of social cohesion at the population scale and identify mechanisms driving the change. We use twelve annual snapshots (2010-2021) of a population-scale social network from the Netherlands linking all residents through family, household, work, school, and neighbor relations. Results show that over this period, social cohesion, quantified as average closure in the network, declines by more than 15%. We demonstrate that the decline is not due to changes in demographic composition, but to rewiring in individual ego networks. Statistical models confirm a decreasing overlap of social contexts and greater geographical mobility as drivers. Residential relocation, however, temporarily increases closure, suggesting that local cohesion-seeking behavior can yield global network fragmentation, with implications for policies related to housing, urban planning, and social integration.
Community detection is a fundamental task in complex network analysis. Fairness-aware community detection seeks to prevent biased node partitions, typically framed in terms of individual fairness, which requires similar nodes to be treated similarly, and group fairness, which aims to avoid disadvantaging specific groups of nodes. While existing literature on fair community detection has primarily focused on group fairness, we introduce a novel measure to quantify individual fairness in community detection methods. The proposed measure captures unfairness as the vectorial distance between a node's true and predicted community representations, computed using the community co-occurrence matrix. We provide a comprehensive empirical investigation of a broad set of community detection algorithms from the literature on both synthetic networks, with varying levels of community explicitness, and real-world networks. We particularly investigate the fairness-performance trade-off using standard quality metrics and compare individual fairness outcomes with existing group fairness measures. The results show that individual unfairness can occur even when group fairness or clustering accuracy is
With the introduction of large-scale network data, including population-scale social networks, techniques for privacy-aware sharing of network data become increasingly important. While existing $k$-anonymity approaches can model different attacker scenarios, they typically assume that attacker knowledge exactly matches the published network structure. We argue that exact knowledge is often unrealistic and introduce $φ$-$k$-anonymity, a fuzzy variant of $k$-anonymity in which parameter $φ$ captures the level of uncertainty in attacker knowledge. Across a benchmark of $39$ real-world networks, a realistic level of uncertainty ($φ=5\%$) renders, on average, $64\%$ of previously unique nodes anonymous. To further enhance anonymity, we apply anonymization algorithms under a $5\%$ edge modification budget. While full anonymization is often unattainable under exact $k$-anonymity, with low uncertainty ($φ=10\%$) our newly proposed Greedy algorithm anonymizes over $99\%$ of the nodes. Uncertainty also enables effective anonymization in otherwise difficult to anonymize dense synthetic graphs. Additionally, data utility in terms of structural properties and performance on network analysis tas
Automation significantly alters human behavior, particularly risk-taking. Previous researches have paid limited attention to the underlying characteristics of automation and their mechanisms of influence on risk-taking. This study investigated how automation affects risk-taking and examined the role of sense of agency therein. By quantifying sense of agency through subjective ratings, this research explored the impact of automation level and reliability level on risk-taking. The results of three experiments indicated that automation reduced the level of risk-taking; higher automation level was associated with lower sense of agency and lower risk-taking, with sense of agency playing a complete mediating role; higher automation reliability was associated with higher sense of agency and higher risk-taking, with sense of agency playing a partial mediating role. The study concludes that automation influences risk-taking, such that higher automation level or lower reliability is associated with a lower likelihood of risk-taking. Sense of agency mediates the impact of automation on risk-taking, and automation level and reliability have different effects on risk-taking.
The promise of equal opportunity is a cornerstone of modern societies, yet upward economic mobility remains out of reach for many. Using a decade of population-scale social network data from the Netherlands, covering over a billion family, school, workplace, and neighborhood ties, we examine how structural inequality and social capital jointly shape economic trajectories. Parental background is a strong early predictor of economic outcomes, but its influence fades over time. In contrast, bridging social capital is what positively predicts long-term mobility, particularly for economically disadvantaged groups. Reducing the dimensionality of an individual's network composition, we identify two key dimensions: exposure to affluent contacts and socioeconomic diversity of one's network. These are sufficient to capture the core aspects of social capital that matter for economic mobility. Overall, our findings demonstrate that while inherited advantage shapes the starting point of economic trajectory, social capital can powerfully reshape it, especially for the poor.
Social networks may contain privacy-sensitive information about individuals. The objective of the network anonymization problem is to alter a given social network dataset such that the number of anonymous nodes in the social graph is maximized. Here, a node is anonymous if it does not have a unique surrounding network structure. At the same time, the aim is to ensure data utility, i.e., preserve topological network properties and retain good performance on downstream network analysis tasks. We propose two versions of a genetic algorithm tailored to this problem: one generic GA and a uniqueness-aware GA (UGA). The latter aims to target edges more effectively during mutation by avoiding edges connected to already anonymous nodes. After hyperparameter tuning, we compare the two GAs against two existing baseline algorithms on several real-world network datasets. Results show that the proposed genetic algorithms manage to anonymize on average 14 times more nodes than the best baseline algorithm. Additionally, data utility experiments demonstrate how the UGA requires fewer edge deletions, and how our GAs and the baselines retain performance on downstream tasks equally well. Overall, our
Understanding community structures is crucial for analyzing networks, as nodes join communities that collectively shape large-scale networks. In real-world settings, the formation of communities is often impacted by several social factors, such as ethnicity, gender, wealth, or other attributes. These factors may introduce structural inequalities; for instance, real-world networks can have a few majority groups and many minority groups. Community detection algorithms, which identify communities based on network topology, may generate unfair outcomes if they fail to account for existing structural inequalities, particularly affecting underrepresented groups. In this work, we propose a set of novel group fairness metrics to assess the fairness of community detection methods. Additionally, we conduct a comparative evaluation of the most common community detection methods, analyzing the trade-off between performance and fairness. Experiments are performed on synthetic networks generated using LFR, ABCD, and HICH-BA benchmark models, as well as on real-world networks. Our results demonstrate that the fairness-performance trade-off varies widely across methods, with no single class of app
Team science dominates scientific knowledge production, but what makes academic teams successful? Using temporal data on 25.2 million publications and 31.8 million authors, we propose a novel network-driven approach to identify and study the success of persistent teams. Challenging the idea that persistence alone drives success, we find that team freshness - new collaborations built on prior experience - is key to success. High impact research tends to emerge early in a team's lifespan. Analyzing complex team overlap, we find that teams open to new collaborative ties consistently produce better science. Specifically, team re-combinations that introduce new freshness impulses sustain success, while persistence impulses from experienced teams are linked to earlier impact. Together, freshness and persistence shape team success across collaboration stages.
Recent studies have shown that visually impaired people have desires to take selfies in the same way as sighted people do to record their photos and share them with others. Although support applications using sound and vibration have been developed to help visually impaired people take selfies using smartphone cameras, it is still difficult to capture everyone in the angle of view, and it is also difficult to confirm that they all have good expressions in the photo. To mitigate these issues, we propose a method to take selfies with multiple people using an omni-directional camera. Specifically, a user takes a few seconds of video with an omni-directional camera, followed by face detection on all frames. The proposed method then eliminates false face detections and complements undetected ones considering the consistency across all frames. After performing facial expression recognition on all the frames, the proposed method finally extracts the frame in which the participants are happiest, and generates a perspective projection image in which all the participants are in the angle of view from the omni-directional frame. In experiments, we use several scenes with different number of p
Cliques, groups of fully connected nodes in a network, are often used to study group dynamics of complex systems. In real-world settings, group dynamics often have a temporal component. For example, conference attendees moving from one group conversation to another. Recently, maximal clique enumeration methods have been introduced that add temporal (and frequency) constraints, to account for such phenomena. These methods enumerate so called (delta,gamma)-maximal cliques. In this work, we introduce an efficient (delta,gamma)-maximal clique enumeration algorithm, that extends gamma from a frequency constraint to a more versatile weighting constraint. Additionally, we introduce a definition of (delta,gamma)-cliques, that resolves a problem of existing definitions in the temporal domain. Our approach, which was inspired by a state-of-the-art two-phase approach, introduces a more efficient initial (stretching) phase. Specifically, we reduce the time complexity of this phase to be linear with respect to the number of temporal edges. Furthermore, we introduce a new approach to the second (bulking) phase, which allows us to efficiently prune search tree branches. Consequently, in experimen