Politics is the set of activities related to strategic decision-making in groups. Political scientists study the strategic interactions between states, institutions, politicians, and citizens; they seek to understand the causes and consequences of those decisions and interactions. While some decisions might alleviate social problems, others might lead to disasters such as war and conflict. Data visualization approaches have the potential to assist political scientists in their studies by providing visual contexts. However, political researchers' perspectives on data visualization are unclear. This paper examines political scientists' perspectives on visualization and how they apply data visualization in their research. We discovered a growing trend in the use of graphs in political science journals. However, we also found a knowledge gap between the political science and visualization domains, such as effective visualization techniques for tasks and the use of color studied by visualization researchers. To reduce this gap, we survey visualization techniques applicable to the political scientists' research and report the visual analytics systems implemented for and evaluated by poli
The recent wave of artificial intelligence, epitomized by large language models (LLMs), has presented opportunities and challenges for methodological innovation in political science, sparking discussions on a potential paradigm shift in the social sciences. However, how can we understand the impact of LLMs on knowledge production and paradigm transformation in the social sciences from a comprehensive perspective that integrates technology and methodology? What are LLMs' specific applications and representative innovative methods in political science research? These questions, particularly from a practical methodological standpoint, remain underexplored. This paper proposes the "Intelligent Computing Social Modeling" (ICSM) method to address these issues by clarifying the critical mechanisms of LLMs. ICSM leverages the strengths of LLMs in idea synthesis and action simulation, advancing intellectual exploration in political science through "simulated social construction" and "simulation validation." By simulating the U.S. presidential election, this study empirically demonstrates the operational pathways and methodological advantages of ICSM. By integrating traditional social science paradigms,
The novel coronavirus pandemic continues to ravage communities across the US. Opinion surveys identified the importance of political ideology in shaping perceptions of the pandemic and compliance with preventive measures. Here, we use social media data to study the complexity of polarization. We analyze a large dataset of tweets related to the pandemic collected between January and May of 2020, and develop methods to classify the ideological alignment of users along the moderacy (hardline vs moderate), political (liberal vs conservative) and science (anti-science vs pro-science) dimensions. While polarization along the science and political dimensions is correlated, politically moderate users are more likely to be aligned with pro-science views, and politically hardline users with anti-science views. Contrary to expectations, we do not find that polarization grows over time; instead, we see increasing activity by moderate pro-science users. We also show that anti-science conservatives tend to tweet from the Southern US, while anti-science moderates tend to tweet from the Western states. Our findings shed light on the multi-dimensional nature of polarization, and the feasibility of tracking polarize
Generative large language models (LLMs) are incredibly useful, versatile, and promising tools. However, they will be of most use to political and social science researchers when they are used in a way that advances understanding about real human behaviors and concerns. To promote the scientific use of LLMs, we suggest that researchers in the political and social sciences need to remain focused on the scientific goal of inference. To this end, we discuss the challenges and opportunities related to scientific inference with LLMs, using validation of model output as an illustrative case for discussion. We propose a set of guidelines related to establishing the failure and success of LLMs when completing particular tasks, and discuss how we can make inferences from these observations. We conclude with a discussion of how this refocus will improve the accumulation of shared scientific knowledge about these tools and their uses in the social sciences.
In recent years, large language models (LLMs) have been widely adopted in political science tasks such as election prediction, sentiment analysis, policy impact assessment, and misinformation detection. Meanwhile, the need to systematically understand how LLMs can further revolutionize the field has also become urgent. In this work, we--a multidisciplinary team of researchers spanning computer science and political science--present the first principled framework, termed Political-LLM, to advance the comprehensive understanding of integrating LLMs into computational political science. Specifically, we first introduce a fundamental taxonomy classifying the existing explorations into two perspectives: political science and computational methodologies. In particular, from the political science perspective, we highlight the role of LLMs in automating predictive and generative tasks, simulating behavior dynamics, and improving causal inference through tools like counterfactual generation; from a computational perspective, we introduce advancements in data preparation, fine-tuning, and evaluation methods for LLMs that are tailored to political contexts. We identify key challenges and future di
This paper presents a three-component work. The first component sets the overall theoretical context, which lies in the argument that the increasing complexity of the world has made it more difficult for International Relations (IR) to succeed both in theory and practice. The era of information and the events of the 21st century have moved IR theory and practice away from real policy making (Walt, 2016) and have left it entrenched in opinions and political theories that are difficult to prove. At the same time, the rise of the "Fourth Paradigm - Data Intensive Scientific Discovery" (Hey et al., 2009) and the strengthening of data science offer an alternative: "Computational International Relations" (Unver, 2018). The use of traditional and contemporary data-centered tools can help to update the field of IR by making it more relevant to reality (Koutsoupias, Mikelis, 2020). The "wedding" between Data Science and IR is no panacea, though. Changes are required both in perceptions and practices. Above all, for Data Science to enter IR, the relevant data must exist. This is where the second component comes into play. I mine the CIA World Factbook which provides cross-domain data covering all count
The basic objective of data visualization is to provide an efficient graphical display for summarizing and reasoning about quantitative information. During the last decades, political science has accumulated a large corpus of various kinds of data, such as comprehensive factbooks and atlases, characterizing all or most existing states by multiple, objectively assessed numerical indicators within a certain time span. As a consequence, there is a continuing trend for political science to become a more quantitative scientific field and to use quantitative information in its analysis and reasoning. It is believed that any objective analysis in political science must be multidimensional and combine various sources of quantitative information; however, human capabilities for perceiving large masses of numerical information are limited. Hence, methods and approaches for visualization of quantitative and qualitative data (and especially multivariate data) are an extremely important topic. Data visualization approaches can be classified into several groups, starting from creating informative charts and diagrams (statistical graphics and infographics) and ending with ad
Political science, and social science in general, have traditionally been using computational methods to study areas such as voting behavior, policy making, international conflict, and international development. More recently, increasingly available quantities of data are being combined with improved algorithms and affordable computational resources to predict, learn, and discover new insights from data that is large in volume and variety. New developments in the areas of machine learning, deep learning, natural language processing (NLP), and, more generally, artificial intelligence (AI) are opening up new opportunities for testing theories and evaluating the impact of interventions and programs in a more dynamic and effective way. Applications using large volumes of structured and unstructured data are becoming common in government and industry, and increasingly also in social science research. This chapter offers an introduction to such methods drawing examples from political science. Focusing on the areas where the strengths of the methods coincide with challenges in these fields, the chapter first presents an introduction to AI and its core technology - machine learning, with i
Large Language Models (LLMs) have achieved significant advances in natural language processing, yet their potential for high-stakes political decision-making remains largely unexplored. This paper addresses the gap by focusing on the application of LLMs to the United Nations (UN) decision-making process, where the stakes are particularly high and political decisions can have far-reaching consequences. We introduce a novel dataset comprising publicly available UN Security Council (UNSC) records from 1994 to 2024, including draft resolutions, voting records, and diplomatic speeches. Using this dataset, we propose the United Nations Benchmark (UNBench), the first comprehensive benchmark designed to evaluate LLMs across four interconnected political science tasks: co-penholder judgment, representative voting simulation, draft adoption prediction, and representative statement generation. These tasks span the three stages of the UN decision-making process--drafting, voting, and discussing--and aim to assess LLMs' ability to understand and simulate political dynamics. Our experimental analysis demonstrates the potential and challenges of applying LLMs in this domain, providing insights int
This paper investigates the extent of political rent seeking in Hungary in the 2010s. Political capitalism--where powerful private interests influence public policy for private gain--creates opportunities for rent seeking that vary across sectors. The analysis is based on a theoretical model assuming rent seeking occurs in a three-stage process: changes in economic institutions granting regulatory privileges, which are enhanced by political-business networks; this leads to scarcities, and increased market power in certain markets; which then generates rents. To quantify this, the study evaluates Hungarian political capitalism by examining the impact of political decisions on firms' rents, analysing the profit trends of the 1,000 largest Hungarian firms (selected annually by net sales) and comparing their mean profit share (earnings before tax) across two periods: 2008-2012 and 2019-2023. A significant increase in a sector's mean profit share was assumed to indicate increased rent seeking. Using Welch's two-sample t-tests, three sectors were identified as potentially experiencing increased rent seeking: agriculture, construction, and financial and insurance activities. Quantitative
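The period comparison described in this abstract can be sketched as follows. This is a minimal illustration of Welch's two-sample t-test in plain Python, not the study's actual code: the function name `welch_t` and the profit-share values are hypothetical, and the two lists merely stand in for a sector's firm-level profit shares in 2008-2012 and 2019-2023.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance uses n - 1).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    # Positive t means the second sample has the larger mean.
    t = (mean(sample_b) - mean(sample_a)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation; no equal-variance assumption.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical profit shares (percent) for one sector; NOT data from the study.
early = [4.1, 3.8, 5.0, 4.4, 3.9]  # stands in for 2008-2012
late = [6.2, 7.1, 5.8, 6.9, 7.4]   # stands in for 2019-2023

t_stat, dof = welch_t(early, late)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Welch's variant is the natural choice when the two periods may differ in variance and in the number of firms, since it does not pool the variances. A p-value would still have to be looked up from the t distribution with `dof` degrees of freedom, for example with a statistics library.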
This publication presents a relation computation, or calculus, for international relations using mathematical modeling. It examines trust in international relations and its calculus, which relates to Bayesian inference, Dempster-Shafer theory, and subjective logic. We found no literature discussing a calculus method for international relations. To bridge this research gap, we propose a relation algebra method for international relations computation. The proposed method allows computation of relations that were previously subjective and incomputable. We also present three international relations as case studies to demonstrate the proposed method in real-world scenarios. The method delivers relation computation for international relations to support government decision makers, such as those in a foreign ministry, a defense ministry, or a presidential or prime minister's office. The Department of Defense (DoD) may use our method to determine whether a nation can be identified as friendly, neutral, or hostile.
Material science literature is a rich source of factual information about various categories of entities (like materials and compositions) and various relations between these entities, such as conductivity, voltage, etc. Automatically extracting this information to generate a material science knowledge base is a challenging task. In this paper, we propose MatSciRE (Material Science Relation Extractor), a Pointer Network-based encoder-decoder framework, to jointly extract entities and relations from material science articles as a triplet ($entity1, relation, entity2$). Specifically, we target the battery materials and identify five relations to work on - conductivity, coulombic efficiency, capacity, voltage, and energy. Our proposed approach achieved a much better F1-score (0.771) than a previous attempt using ChemDataExtractor (0.716). The overall graphical framework of MatSciRE is shown in Fig 1. The material information is extracted from material science literature in the form of entity-relation triplets using MatSciRE.
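To make the triplet output and its F1 evaluation concrete, here is a minimal sketch. It assumes exact-match scoring over sets of triplets, which the abstract does not specify; the `Triplet` type, the `triplet_f1` function, and the example values are hypothetical and are not taken from MatSciRE or its dataset.

```python
from typing import NamedTuple, Set


class Triplet(NamedTuple):
    """One extracted fact in the form (entity1, relation, entity2)."""
    entity1: str
    relation: str
    entity2: str


def triplet_f1(predicted: Set[Triplet], gold: Set[Triplet]) -> float:
    """Exact-match F1: a prediction counts only if all three fields match."""
    true_pos = len(predicted & gold)
    if true_pos == 0:
        # Covers empty inputs and the case of no overlap.
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical triplets for illustration only.
gold = {
    Triplet("LiFePO4", "voltage", "3.4 V"),
    Triplet("LiFePO4", "capacity", "170 mAh/g"),
    Triplet("LiCoO2", "voltage", "3.9 V"),
}
predicted = {
    Triplet("LiFePO4", "voltage", "3.4 V"),
    Triplet("LiCoO2", "voltage", "3.9 V"),
    Triplet("LiCoO2", "capacity", "140 mAh/g"),
}

score = triplet_f1(predicted, gold)
print(f"F1 = {score:.3f}")
```

Scoring whole triplets rather than entities and relations separately penalizes a model that finds the right entities but links them with the wrong relation, which is the failure a joint extractor is meant to avoid.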
We analyze international co-authorship relations in the Social Science Citation Index 2011 using all citable items in the DVD-version of this index. Network statistics indicate four groups of nations: (i) an Asian-Pacific one to which all Anglo-Saxon nations (including the UK and Ireland) are attributed; (ii) a continental European one including also the Latin-American countries; (iii) the Scandinavian nations; and (iv) a community of African nations. Within the EU-28 (including Croatia), eleven of the EU-15 states have dominant positions. Collapsing the EU-28 into a single node leads to a bi-polar structure between the US and EU-28; China is part of the US-pole. We develop an information-theoretical test to distinguish whether international collaborations or domestic collaborations prevail; the results are mixed, but the international dimension is more important than the national one in the aggregated sets (this was found in both SSCI and SCI). In France, however, the national distribution is more important than the international one, while the reverse is true for most European nations in the core group (UK, Germany, the Netherlands, etc.). Decomposition of the USA in terms of sta
Shift-share designs are gaining popularity in political science. This article introduces shift-share designs, reviews their application in the literature, synthesizes recent methodological developments, and discusses their potential utility in the field. Although shift-share designs have long been used in economics, their causal properties have only recently begun to be understood. Articles in political science tend to be aware of these developments, but do not fully discuss and test identifying assumptions and sometimes apply the methods incorrectly. Most articles rely on the share exogeneity framework, suggesting that the shifter exogeneity framework is underutilized despite its comparable prevalence in economics. I illustrate the shifter exogeneity framework and develop auxiliary theoretical results that are potentially useful in applying the framework in political science settings.
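For readers new to the design, the standard construction of a shift-share variable can be sketched in a few lines: each unit's exposure is the sum, over sectors, of its local share in that sector times a common sector-level shift. The function name, region and sector labels, and numbers below are hypothetical and are not drawn from the article.

```python
def shift_share(shares, shifts):
    """Return z_i = sum over sectors k of shares[i][k] * shifts[k]."""
    instrument = {}
    for unit, unit_shares in shares.items():
        instrument[unit] = sum(
            share * shifts[sector] for sector, share in unit_shares.items()
        )
    return instrument


# Hypothetical local sector shares (each unit's shares sum to 1).
shares = {
    "region_a": {"manufacturing": 0.6, "services": 0.4},
    "region_b": {"manufacturing": 0.2, "services": 0.8},
}
# Hypothetical common shocks, e.g. national sector growth rates.
shifts = {"manufacturing": -0.10, "services": 0.05}

exposure = shift_share(shares, shifts)
for unit, value in exposure.items():
    print(f"{unit}: {value:+.3f}")
```

The two frameworks named in the abstract differ in which input is treated as the source of identifying variation: share exogeneity places the assumption on `shares`, while shifter exogeneity places it on `shifts`.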
How has the credibility revolution shaped political science? We address this question by classifying 91,632 articles published between 2003 and 2023 across 156 political science journals using large language models, focusing on research design, credibility-enhancing practices, and citation patterns. We find that design-based studies -- those leveraging plausibly exogenous variation to justify causal claims -- have become increasingly common and receive a citation premium. In contrast, model-based approaches that rely on strong modeling assumptions have declined. Yet the rise of design-based work is uneven: it is concentrated in top journals and among authors at highly ranked institutions, and it is driven primarily by the growth of survey experiments. Other credibility-enhancing practices that help reduce false positives and false negatives, such as placebo tests and power calculations, remain rare. Taken together, our findings point to substantial but selective change, more consistent with a partial reform than a revolution.
Political advertising on social media has become a central element in election campaigns. However, granular information about political advertising on social media was previously unavailable, thus raising concerns regarding fairness, accountability, and transparency in the electoral process. In this paper, we analyze targeted political advertising on social media via a unique, large-scale dataset of over 80000 political ads from Meta during the 2021 German federal election, with more than 1.1 billion impressions. For each political ad, our dataset records granular information about targeting strategies, spending, and actual impressions. We then study (i) the prevalence of targeted ads across the political spectrum; (ii) the discrepancies between targeted and actual audiences due to algorithmic ad delivery; and (iii) which targeting strategies on social media attain a wide reach at low cost. We find that targeted ads are prevalent across the entire political spectrum. Moreover, there are considerable discrepancies between targeted and actual audiences, and systematic differences in the reach of political ads (in impressions-per-EUR) among parties, where the algorithm favors ads from
Social media has reshaped political discourse, offering politicians a platform for direct engagement while reinforcing polarization and ideological divides. This study introduces a novel topic evolution framework that integrates BERTopic-based topic modeling with Moral Foundations Theory (MFT) to analyze the longevity and moral dimensions of political topics in Twitter activity during the 117th U.S. Congress. We propose a methodology for tracking dynamic topic shifts over time, measuring their association with moral values, and quantifying topic persistence. Our findings reveal that while overarching themes remain stable, granular topics tend to dissolve rapidly, limiting their long-term influence. Moreover, moral foundations play a critical role in topic longevity, with Care and Loyalty dominating durable topics, while partisan differences manifest in distinct moral framing strategies. This work contributes to the field of social network analysis and computational political discourse by offering a scalable, interpretable approach to understanding moral-driven topic evolution on social media.
In response to public scrutiny of data-driven algorithms, the field of data science has adopted ethics training and principles. Although ethics can help data scientists reflect on certain normative aspects of their work, such efforts are ill-equipped to generate a data science that avoids social harms and promotes social justice. In this article, I argue that data science must embrace a political orientation. Data scientists must recognize themselves as political actors engaged in normative constructions of society and evaluate their work according to its downstream impacts on people's lives. I first articulate why data scientists must recognize themselves as political actors. In this section, I respond to three arguments that data scientists commonly invoke when challenged to take political positions regarding their work. In confronting these arguments, I describe why attempting to remain apolitical is itself a political stance--a fundamentally conservative one--and why data science's attempts to promote "social good" dangerously rely on unarticulated and incrementalist political assumptions. I then propose a framework for how data science can evolve toward a deliberative and rigo
Interest is increasing among political scientists in leveraging the extensive information available in images. However, the challenge of interpreting these images lies in the need for specialized knowledge in computer vision and access to specialized hardware. As a result, image analysis has been limited to a relatively small group within the political science community. This landscape could potentially change thanks to the rise of large language models (LLMs). This paper aims to raise awareness of the feasibility of using Gemini for image content analysis. A retrospective analysis was conducted on a corpus of 688 images. Content reports were elicited from Gemini for each image and then manually evaluated by the authors. We find that Gemini is highly accurate in performing object detection, which is arguably the most common and fundamental task in image analysis for political scientists. Equally important, we show that it is easy to implement as the entire command consists of a single prompt in natural language; it is fast to run and should meet the time budget of most researchers; and it is free to use and does not require any specialized hardware. In addition, we illustrate how p
The Internet and the Web are now an integral part of the way most modern societies, and their corresponding political systems, work. We regard political systems as the formal and informal political processes by which decisions are made concerning the use, production and distribution of resources in any given society. Our focus is on the sets of agents - Persons and Organizations - that govern a society, and their relations. We present a set of ontologies aimed at characterizing different kinds of direct and indirect relations that occur within a Political System. The goal is to provide a more semantically precise basis for determining more abstract notions such as "influence". These ontologies are being used for the "Se Liga na Politica" project, whose goal is to provide an open linked data database of Political Agents in Brazil. Although they are being used in a particular political system, these ontologies can be applied to different political systems.