The community plays a crucial role in understanding user behavior and network characteristics in social networks. Some users can use multiple social networks at once for a variety of objectives. These users are called overlapping users who bridge different social networks. Detecting communities across multiple social networks is vital for interaction mining, information diffusion, and behavior migration analysis among networks. This paper presents a community detection method based on nonnegative matrix tri-factorization for multiple heterogeneous social networks, which formulates a common consensus matrix to represent the global fused community. Specifically, the proposed method involves creating adjacency matrices based on network structure and content similarity, followed by alignment matrices which distinguish overlapping users in different social networks. With the generated alignment matrices, the method could enhance the fusion degree of the global community by detecting overlapping user communities across networks. The effectiveness of the proposed method is evaluated with new metrics on Twitter, Instagram, and Tumblr datasets. The results of the experiments demonstrate its
Sociotechnical research increasingly includes the social sub-networks that emerge from large-scale sociotechnical infrastructure, including the infrastructure for building open source software. This paper addresses these numerous sub-networks as advantageous for researchers. It provides a methodological synthesis focusing on how researchers can best span adjacent social sub-networks during engaged field research. Specifically, we describe practices and artifacts that aid movement from one social subsystem within a more extensive technical infrastructure to another. To surface the importance of spanning sub-networks, we incorporate a discussion of social capital and the role of technical infrastructure in its development for sociotechnical researchers. We then characterize a five-step process for spanning social sub-networks during engaged field research: commitment, context mapping, jargon competence, returning value, and bridging. We then present our experience studying corporate open source software projects and the role of that experience in accelerating our work in open source scientific software research as described through the lens of bridging social capital. Based on our an
Social media platforms have been establishing content moderation guidelines and employing various moderation policies to counter hate speech and misinformation. The goal of this paper is to study these community guidelines and moderation practices, as well as the relevant research publications, to identify the research gaps, differences in moderation techniques, and challenges that should be tackled by the social media platforms and the research community. To this end, we study and analyze fourteen most popular social media content moderation guidelines and practices, and consolidate them. We then introduce three taxonomies drawn from this analysis as well as covering over two hundred interdisciplinary research papers about moderation strategies. We identify the differences between the content moderation employed in mainstream and fringe social media platforms. Finally, we have in-depth applied discussions on both research and practical challenges and solutions.
Although beneficial information abounds on social media, the dissemination of harmful information such as so-called ``fake news'' has become a serious issue. Therefore, many researchers have devoted considerable effort to limiting the diffusion of harmful information. A promising approach to limiting diffusion of such information is link deletion methods in social networks. Link deletion methods have been shown to be effective in reducing the size of information diffusion cascades generated by synthetic models on a given social network. In this study, we evaluate the effectiveness of link deletion methods by using actual logs of retweet cascades, rather than by using synthetic diffusion models. Our results show that even after deleting 10\%--50\% of links from a social network, the size of cascades after link deletion is estimated to be only 50\% the original size under the optimistic estimation, which suggests that the effectiveness of the link deletion strategy for suppressing information diffusion is limited. Moreover, our results also show that there is a considerable number of cascades with many seed users, which renders link deletion methods inefficient.
The rise of social media has fundamentally transformed how people engage in public discourse and form opinions. While these platforms offer unprecedented opportunities for democratic engagement, they have been implicated in increasing social polarization and the formation of ideological echo chambers. Previous research has primarily relied on observational studies of social media data or theoretical modeling approaches, leaving a significant gap in our understanding of how individuals respond to and are influenced by polarized online environments. Here we present a novel experimental framework for investigating polarization dynamics that allows human users to interact with LLM-based artificial agents in a controlled social network simulation. Through a user study with 122 participants, we demonstrate that this approach can successfully reproduce key characteristics of polarized online discourse while enabling precise manipulation of environmental factors. Our results provide empirical validation of theoretical predictions about online polarization, showing that polarized environments significantly increase perceived emotionality and group identity salience while reducing expressed
Texts reveal the subjects of interest in research fields, and the values, beliefs, and practices of researchers. In this study, texts are examined through bibliometric mapping and topic modeling to provide a birds eye view of the social dynamics associated with the diffusion of research synthesis methods in the contexts of Social Work and Women's Studies. Research synthesis texts are especially revealing because the methods, which include meta-analysis and systematic review, are reliant on the availability of past research and data, sometimes idealized as objective, egalitarian approaches to research evaluation, fundamentally tied to past research practices, and performed with the goal informing future research and practice. This study highlights the co-influence of past and subsequent research within research fields; illustrates dynamics of the diffusion process; and provides insight into the cultural contexts of research in Social Work and Women's Studies. This study suggests the potential to further develop bibliometric mapping and topic modeling techniques to inform research problem selection and resource allocation.
Decentralized Online Social Networks (DOSNs) represent a growing trend in the social media landscape, as opposed to the well-known centralized peers, which are often in the spotlight due to privacy concerns and a vision typically focused on monetization through user relationships. By exploiting open-source software, DOSNs allow users to create their own servers, or instances, thus favoring the proliferation of platforms that are independent yet interconnected with each other in a transparent way. Nonetheless, the resulting cooperation model, commonly known as the Fediverse, still represents a world to be fully discovered, since existing studies have mainly focused on a limited number of structural aspects of interest in DOSNs. In this work, we aim to fill a lack of study on user relations and roles in DOSNs, by taking two main actions: understanding the impact of decentralization on how users relate to each other within their membership instance and/or across different instances, and unveiling user roles that can explain two interrelated axes of social behavioral phenomena, namely information consumption and boundary spanning. To this purpose, we build our analysis on user networks
Social network alignment has been an important research problem for social network analysis in recent years. With the identified shared users across networks, it will provide researchers with the opportunity to achieve a more comprehensive understanding of users' social activities both within and across networks. Social network alignment is a very difficult problem. Besides the challenges introduced by the network heterogeneity, the network alignment problem can be reduced to a combinatorial optimization problem with an extremely large search space. The learning effectiveness and efficiency of existing alignment models will be degraded significantly as the network size increases. In this paper, we will focus on studying the scalable heterogeneous social network alignment problem, and propose to address it with a novel two-stage network alignment model, namely \textbf{S}calable \textbf{H}eterogeneous \textbf{N}etwork \textbf{A}lignment (SHNA). Based on a group of intra- and inter-network meta diagrams, SHNA first partitions the social networks into a group of sub-networks synergistically. Via the partially known anchor links, SHNA will extract the partitioned sub-network corresponde
All online sharing systems gather data that reflects users' collective behaviour and their shared activities. This data can be used to extract different kinds of relationships, which can be grouped into layers, and which are basic components of the multidimensional social network proposed in the paper. The layers are created on the basis of two types of relations between humans, i.e. direct and object-based ones which respectively correspond to either social or semantic links between individuals. For better understanding of the complexity of the social network structure, layers and their profiles were identified and studied on two, spanned in time, snapshots of the Flickr population. Additionally, for each layer, a separate strength measure was proposed. The experiments on the Flickr photo sharing system revealed that the relationships between users result either from semantic links between objects they operate on or from social connections of these users. Moreover, the density of the social network increases in time. The second part of the study is devoted to building a social recommender system that supports the creation of new relations between users in a multimedia sharing syst
Online social networking sites such as Facebook, Twitter and Flickr are among the most popular sites on the Web, providing platforms for sharing information and interacting with a large number of people. The different ways for users to interact, such as liking, retweeting and favoriting user-generated content, are among the defining and extremely popular features of these sites. While empirical studies have been done to learn about the network growth processes in these sites, few studies have focused on social interaction behaviour and the effect of social interaction on network growth. In this paper, we analyze large-scale data collected from the Flickr social network to learn about individual favoriting behaviour and examine the occurrence of link formation after a favorite is created. We do this using a systematic formulation of Flickr as a two-layer temporal multiplex network: the first layer describes the follow relationship between users and the second layer describes the social interaction between users in the form of favorite markings to photos uploaded by them. Our investigation reveals that (a) favoriting is well-described by preferential attachment, (b) over 50% of favor
Online professional social networking platforms provide opportunities to expand networks strategically for job opportunities and career advancement. A large body of research shows that women's offline networks are less advantageous than men's. How online platforms such as LinkedIn may reflect or reproduce gendered networking behaviours, or how online social connectivity may affect outcomes differentially by gender is not well understood. This paper analyses aggregate, anonymised data from almost 10 million LinkedIn users in the UK and US information technology (IT) sector collected from the site's advertising platform to explore how being connected to Big Tech companies ('social connectivity') varies by gender, and how gender, age, seniority and social connectivity shape the propensity to report job promotions or relocations. Consistent with previous studies, we find there are fewer women compared to men on LinkedIn in IT. Furthermore, female users are less likely to be connected to Big Tech companies than men. However, when we further analyse recent promotion or relocation reports, we find women are more likely than men to have reported a recent promotion at work, suggesting high-
Social recommendation, which seeks to leverage social ties among users to alleviate the sparsity issue of user-item interactions, has emerged as a popular technique for elevating personalized services in recommender systems. Despite being effective, existing social recommendation models are mainly devised for recommending regular items such as blogs, images, and products, and largely fail for community recommendations due to overlooking the unique characteristics of communities. Distinctly, communities are constituted by individuals, who present high dynamicity and relate to rich structural patterns in social networks. To our knowledge, limited research has been devoted to comprehensively exploiting this information for recommending communities. To bridge this gap, this paper presents CASO, a novel and effective model specially designed for social community recommendation. Under the hood, CASO harnesses three carefully-crafted encoders for user embedding, wherein two of them extract community-related global and local structures from the social network via social modularity maximization and social closeness aggregation, while the third one captures user preferences using collaborati
A social network confers benefits and advantages on individuals (and on groups), the literature refers to these advantages as social capital. This paper presents a micro-founded mathematical model of the evolution of a social network and of the social capital of individuals within the network. The evolution of the network is influenced by the extent to which individuals are homophilic, structurally opportunistic, socially gregarious and by the distribution of types in the society. In the analysis, we identify different kinds of social capital: bonding capital, popularity capital, and bridging capital. Bonding capital is created by forming a circle of connections, homophily increases bonding capital because it makes this circle of connections more homogeneous. Popularity capital leads to preferential attachment: individuals who become popular tend to become more popular because others are more likely to link to them. Homophily creates asymmetries in the levels of popularity attained by different social groups, more gregarious types of agents are more likely to become popular. However, in homophilic societies, individuals who belong to less gregarious, less opportunistic, or major ty
In 2016, a network of social media accounts animated by Russian operatives attempted to divert political discourse within the American public around the presidential elections. This was a coordinated effort, part of a Russian-led complex information operation. Utilizing the anonymity and outreach of social media platforms Russian operatives created an online astroturf that is in direct contact with regular Americans, promoting Russian agenda and goals. The elusiveness of this type of adversarial approach rendered security agencies helpless, stressing the unique challenges this type of intervention presents. Building on existing scholarship on the functions within influence networks on social media, we suggest a new approach to map those types of operations. We argue that pretending to be legitimate social actors obliges the network to adhere to social expectations, leaving a social footprint. To test the robustness of this social footprint we train artificial intelligence to identify it and create a predictive model. We use Twitter data identified as part of the Russian influence network for training the artificial intelligence and to test the prediction. Our model attains 88% pred
Politicization is a social phenomenon studied by political science characterized by the extent to which ideas and facts are given a political tone. A range of topics, such as climate change, religion and vaccines has been subject to increasing politicization in the media and social media platforms. In this work, we propose a computational method for assessing politicization in online conversations based on topic shifts, i.e., the degree to which people switch topics in online conversations. The intuition is that topic shifts from a non-political topic to politics are a direct measure of politicization -- making something political, and that the more people switch conversations to politics, the more they perceive politics as playing a vital role in their daily lives. A fundamental challenge that must be addressed when one studies politicization in social media is that, a priori, any topic may be politicized. Hence, any keyword-based method or even machine learning approaches that rely on topic labels to classify topics are expensive to run and potentially ineffective. Instead, we learn from a seed of political keywords and use Positive-Unlabeled (PU) Learning to detect political com
How does one find important or influential people in an online social network? Researchers have proposed a variety of centrality measures to identify individuals that are, for example, often visited by a random walk, infected in an epidemic, or receive many messages from friends. Recent research suggests that a social media users' capacity to respond to an incoming message is constrained by their finite attention, which they divide over all incoming information, i.e., information sent by users they follow. We propose a new measure of centrality --- limited-attention version of Bonacich's Alpha-centrality --- that models the effect of limited attention on epidemic diffusion. The new measure describes a process in which nodes broadcast messages to their out-neighbors, but the neighbors' ability to receive the message depends on the number of in-neighbors they have. We evaluate the proposed measure on real-world online social networks and show that it can better reproduce an empirical influence ranking of users than other popular centrality measures.
To study the effects of Online Social Network (OSN) activity on real-world offline events, researchers need access to OSN data, the reliability of which has particular implications for social network analysis. This relates not only to the completeness of any collected dataset, but also to constructing meaningful social and information networks from them. In this multidisciplinary study, we consider the question of constructing traditional social networks from OSN data and then present a measurement case study showing how the reliability of OSN data affects social network analyses. To this end we developed a systematic comparison methodology, which we applied to two parallel datasets we collected from Twitter. We found considerable differences in datasets collected with different tools and that these variations significantly alter the results of subsequent analyses. Our results lead to a set of guidelines for researchers planning to collect online data streams to infer social networks.
The year 2020 will be remembered for two events of global significance: the COVID-19 pandemic and 2020 U.S. Presidential Election. In this chapter, we summarize recent studies using large public Twitter data sets on these issues. We have three primary objectives. First, we delineate epistemological and practical considerations when combining the traditions of computational research and social science research. A sensible balance should be struck when the stakes are high between advancing social theory and concrete, timely reporting of ongoing events. We additionally comment on the computational challenges of gleaning insight from large amounts of social media data. Second, we characterize the role of social bots in social media manipulation around the discourse on the COVID-19 pandemic and 2020 U.S. Presidential Election. Third, we compare results from 2020 to prior years to note that, although bot accounts still contribute to the emergence of echo-chambers, there is a transition from state-sponsored campaigns to domestically emergent sources of distortion. Furthermore, issues of public health can be confounded by political orientation, especially from localized communities of acto
Environmental Social Governance (ESG) is a widely used metric that measures the sustainability of a company practices. Currently, ESG is determined using self-reported corporate filings, which allows companies to portray themselves in an artificially positive light. As a result, ESG evaluation is subjective and inconsistent across raters, giving executives mixed signals on what to improve. This project aims to create a data-driven ESG evaluation system that can provide better guidance and more systemized scores by incorporating social sentiment. Social sentiment allows for more balanced perspectives which directly highlight public opinion, helping companies create more focused and impactful initiatives. To build this, Python web scrapers were developed to collect data from Wikipedia, Twitter, LinkedIn, and Google News for the S&P 500 companies. Data was then cleaned and passed through NLP algorithms to obtain sentiment scores for ESG subcategories. Using these features, machine-learning algorithms were trained and calibrated to S&P Global ESG Ratings to test their predictive capabilities. The Random-Forest model was the strongest model with a mean absolute error of 13.4% an
It is quite obvious that in the real world, more than one kind of relationship can exist between two actors and that those ties can be so intertwined that it is impossible to analyse them separately [Fienberg 85], [Minor 83], [Szell 10]. Social networks with more than one type of relation are not a completely new concept [Wasserman 94] but they were analysed mainly at the small scale, e.g. in [McPherson 01], [Padgett 93], and [Entwisle 07]. Just like in the case of regular single-layered social network there is no widely accepted definition or even common name. At the beginning such networks have been called multiplex network [Haythornthwaite 99], [Monge 03]. The term is derived from communications theory which defines multiplex as combining multiple signals into one in such way that it is possible to separate them if needed [Hamill 06]. Recently, the area of multi-layered social network has started attracting more and more attention in research conducted within different domains [Kazienko 11a], [Szell 10], [Rodriguez 07], [Rodriguez 09], and the meaning of multiplex network has expanded and covers not only social relationships but any kind of connection, e.g. based on geography, o