Sampling geographically dispersed minority populations poses substantial challenges when individual group membership cannot be directly observed. Although stratified sampling can offer efficiency gains, these gains are typically modest unless the minority population is highly concentrated within a small number of strata. In this paper, we propose using Bayesian Improved Surname Geocoding (BISG) to enhance the efficiency of minority population sampling. BISG generates individual-level probabilities of minority group membership based on names and residential addresses. We incorporate these probabilities into a stratified Poisson probability sampling design. Applying the proposed approach to a national survey of Jewish Americans, we find that our estimates closely align with those from a large-scale Pew Research Center survey of the same population, which relied on a substantially more expensive sampling strategy involving geographic stratification and screening. At a fraction of the cost, our survey reproduces nearly identical patterns observed by Pew, including estimates of religious denominations and participation in specific religious activities.
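As an illustration of how membership probabilities might enter a stratified Poisson design, the sketch below draws each unit independently with an inclusion probability proportional to its BISG-style probability within its stratum, and attaches the inverse-probability (Horvitz-Thompson) weight. The abstract does not specify the allocation rule, so the proportional rule, the field names (`stratum`, `p_member`), the per-stratum targets in `expected_n`, and the toy data are all assumptions made for this example, not the authors' design.

```python
import random
from collections import defaultdict


def inclusion_probabilities(units, expected_n):
    """Inclusion probability per unit, proportional to p_member within its stratum.

    Assumed rule (not from the paper): pi_i = n_h * p_i / sum of p_j in stratum h,
    capped at 1. Capping lowers the expected sample size below n_h.
    """
    totals = defaultdict(float)
    for u in units:
        totals[u["stratum"]] += u["p_member"]
    probs = {}
    for u in units:
        total = totals[u["stratum"]]
        raw = expected_n[u["stratum"]] * u["p_member"] / total if total > 0 else 0.0
        probs[u["id"]] = min(1.0, raw)
    return probs


def poisson_sample(units, expected_n, seed=0):
    """Poisson sampling: one independent Bernoulli draw per unit."""
    rng = random.Random(seed)
    probs = inclusion_probabilities(units, expected_n)
    sample = []
    for u in units:
        pi = probs[u["id"]]
        if rng.random() < pi:  # never true when pi == 0, so 1 / pi is safe
            sample.append({"id": u["id"], "stratum": u["stratum"],
                           "pi": pi, "weight": 1.0 / pi})
    return sample


# Hypothetical frame: 100 units in two strata with made-up probabilities.
units = [{"id": i, "stratum": "A" if i < 50 else "B",
          "p_member": 0.05 + 0.009 * (i % 50)} for i in range(100)]
expected_n = {"A": 5, "B": 10}

if __name__ == "__main__":
    for row in poisson_sample(units, expected_n, seed=1):
        print(row)
```

Because the draws are independent, the realized sample size is random; only its expectation is controlled by `expected_n`, which is the usual trade-off of a Poisson design.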
Depression affects more than 280 million people worldwide, and poorer communities bear a disproportionate share of the burden while facing greater barriers to treatment. This study examines the role of pharmacy pricing caps in access to antidepressants among poorer Americans through a bibliometric analysis of the 100 most cited articles on antidepressant pricing and access in the Web of Science Core Collection. We used tools such as Bibliometrix and VOSviewer to visualize publication trends, dominant contributors, thematic clusters, and citation networks in the literature. The findings highlight persistent inequalities in access to antidepressants, driven by high drug prices and by systemic inequalities that particularly affect racial and ethnic minorities. High prices for branded antidepressants are associated with lower treatment initiation and adherence, worse mental health outcomes, and increased healthcare utilization. This work identifies critical gaps in the literature and calls for immediate policy action to make antidepressants affordable and accessible to marginalized communities.
Despite increasing AI chatbot deployment in public discourse, empirical evidence on their capacity to foster intercultural empathy remains limited. Through a randomized experiment, we assessed how different AI deliberation approaches--cross-cultural deliberation (presenting other-culture perspectives), own-culture deliberation (representing participants' own culture), and non-deliberative control--affect intercultural empathy across American and Latin American participants. Cross-cultural deliberation increased intercultural empathy among American participants through positive emotional engagement, but produced no such effects for Latin American participants, who perceived AI responses as culturally inauthentic despite explicit prompting to represent their cultural perspectives. Our analysis of participant-driven feedback, where users directly flagged and explained culturally inappropriate AI responses, revealed systematic gaps in AI's representation of Latin American contexts that persist despite sophisticated prompt engineering. These findings demonstrate that current approaches to AI cultural alignment--including linguistic adaptation and explicit cultural prompting--cannot full
COVID-19 has aided the spread of racism, as well as national insecurity, distrust of immigrants, and general xenophobia, all of which may be linked to the rise in anti-Asian hate crimes during the pandemic. Coronavirus Disease 2019 (COVID-19) is thought to have originated in late December 2019 in Wuhan, China, and quickly spread across the world during the spring of 2020. Asian Americans recorded an increase in racially motivated hate crimes, including physical abuse and intimidation, as COVID-19 spread throughout the United States. This research study was conducted by high school students in the Bay Area to compare the intention and characteristics of hate crimes against Asian Americans to hate crimes against African Americans. According to studies of both victim-related and most offender-related variables, hate crimes against Asian Americans have been rapidly growing in the United States and differ from those against African Americans. This leads to an investigation into the racial disparity between Asian American offenders and those of other races. The nature and characteristics of hate crimes against Asian Americans are compared to those of hate crimes against African Americans i
Efficient coverage for newly developed vaccines requires knowing which groups of individuals will accept the vaccine immediately and which will take longer to accept or never accept. Of those who may eventually accept the vaccine, there are two main types: success-based learners, who base their decisions on others' satisfaction, and myopic rationalists, who attend to their own immediate perceived benefit. We used COVID-19 vaccination data to fit a mechanistic model capturing the distinct effects of the two types on vaccination progress. We estimated that 47 percent of Americans behaved as myopic rationalists, with high variation across jurisdictions, from 31 percent in Mississippi to 76 percent in Vermont. The proportion was correlated with vaccination coverage, the proportion of votes in favor of Democrats in the 2020 presidential election, and education score.
We scanned through the genomes of 29,141 African Americans, searching for loci where the average proportion of African ancestry deviates significantly from the genome-wide average. We failed to find any genome-wide significant deviations, and conclude that any selection in African Americans since admixture is sufficiently weak that it falls below the threshold of our power to detect it using a large sample size. These results stand in contrast to the findings of a recent study of selection in African Americans. That study, which had 15 times fewer samples, reported six loci with significant deviations. We show that the discrepancy is likely due to insufficient correction for multiple hypothesis testing in the previous study. The same study reported 14 loci that showed greater population differentiation between African Americans and Nigerian Yoruba than would be expected in the absence of natural selection. Four such loci were previously shown to be genome-wide significant and likely to be affected by selection, but we show that most of the 10 additional loci are likely to be false positives. Additionally, the most parsimonious explanation for the loci that have significant evidence
News about massive online breaches is increasingly common, but there has been little good data on how exposed people are because of these breaches. We combine data from a large, representative sample of adult Americans (n = 5,000) with data from \textit{Have I Been Pwned} to estimate the lower bound of the average number of breached online accounts per person. We find that at least 82.84% of Americans have had their accounts breached, and that on average Americans' accounts have been breached at least three times. The better educated, the middle-aged, women, and Whites are more likely to have had their accounts breached than the complementary groups.
Socio-economic disparities quite often play a central role in the unfolding of large-scale catastrophic events. One of the most concerning aspects of the ongoing COVID-19 pandemic is that it disproportionately affects people from Black and African American backgrounds, creating an unexpected infection gap. Interestingly, the abnormal impact on these ethnic groups seems to be almost uncorrelated with other risk factors, including co-morbidity, poverty, level of education, access to healthcare, residential segregation, and response to cures. A proposed explanation for the observed incidence gap is that people from African American backgrounds are more often employed in low-income service jobs, and are thus more exposed to infection through face-to-face contacts, but the lack of direct data has so far prevented strong conclusions in this regard. Here we introduce the concept of dynamic segregation, that is, the extent to which a given group of people is internally clustered or exposed to other groups as a result of mobility and commuting habits. By analysing census and mobility data on more than 120 major US cities, we found that the dynamic segregation of African American com
Systemic property dispossession from minority groups has often been carried out in the name of technological progress. In this paper, we identify evidence that the current paradigm of large language models (LLMs) likely continues this long history. Examining common LLM training datasets, we find that a disproportionate amount of content authored by Jewish Americans is used for training without their consent. The degree of over-representation ranges from around 2x to around 6.5x. Given that LLMs may substitute for the paid labor of those who produced their training data, they have the potential to cause even more substantial and disproportionate economic harm to Jewish Americans in the coming years. This paper focuses on Jewish Americans as a case study, but other minority communities (e.g., Asian Americans, Hindu Americans) are probably similarly affected. Most importantly, the results should likely be interpreted as a "canary in the coal mine" that highlights deep structural concerns about the current LLM paradigm, whose harms could soon affect nearly everyone. We discuss the implications of these results for policymakers thinking about how to regulate LLMs as
How do authoritarian regimes strengthen global support for nondemocratic political systems? Roughly half of the users of the social media platform TikTok report getting news from social media influencers. Against this backdrop, authoritarian regimes have increasingly outsourced content creation to these influencers. To gain understanding of the extent of this phenomenon and the persuasive capabilities of these influencers, we collect comprehensive data on pro-China influencers on TikTok. We show that pro-China influencers have more engagement than state media. We then create a realistic clone of the TikTok app, and conduct a randomized experiment in which over 8,500 Americans are recruited to use this app and view a random sample of actual TikTok content. We show that pro-China foreign influencers are strikingly effective at increasing favorability toward China, while traditional Chinese state media causes backlash. The findings highlight the importance of influencers in shaping global public opinion.
In AI, most evaluations of natural language understanding tasks are conducted in standardized dialects such as Standard American English (SAE). In this work, we investigate how accurately large language models (LLMs) represent African American Vernacular English (AAVE). We analyze three LLMs to compare their usage of AAVE to that of humans who natively speak AAVE. We first analyze interviews from the Corpus of Regional African American Language and TwitterAAE to identify the typical contexts where people use AAVE grammatical features such as "ain't". We then prompt the LLMs to produce text in AAVE and compare the model-generated text to human usage patterns. We find that, in many cases, there are substantial differences between AAVE usage in LLMs and humans: LLMs usually underuse and misuse grammatical features characteristic of AAVE. Furthermore, through sentiment analysis and manual inspection, we find that the models replicate stereotypes about African Americans. These results highlight the need for more diversity in training data and the incorporation of fairness methods to mitigate the perpetuation of stereotypes.
This paper studies the regularity of finite-maturity American value functions in the Heston model. Although the Heston operator is degenerate when the volatility is zero, we are able to establish C^{1,2} regularity of the American value functions in the exercise domain and the smooth-fit principle, using PDE techniques.
We develop a practical framework for identifying and quantifying the hidden layers of risk and optionality embedded in American options by introducing stochasticity into one or more of their underlying determinants. This heuristic approach remedies the problems of conventional pricing systems, which treat some key inputs deterministically and hence systematically underestimate the flexibility and convexity inherent in early-exercise features.
Federal agencies and researchers increasingly use large language models to analyze and simulate public opinion. When AI mediates between the public and policymakers, accuracy across intersecting identities becomes consequential; inaccurate group-level estimates may mislead outreach, consultation, and policy design. While research examines intersectionality in LLM outputs, few studies have compared these outputs against real human responses across intersecting identities. This gap is particularly urgent for climate change, where opinion is contested and diverse. We investigate how LLMs represent demographic and intersectional patterns in U.S. climate opinions. We prompted six LLMs with profiles of 978 respondents from a nationally representative U.S. climate opinion survey and compared AI-generated responses to actual human answers across 20 questions. We find that LLMs appear to compress the diversity of American climate opinions, predicting less-concerned groups as more concerned and vice versa. This compression is intersectional: LLMs appear to apply uniform gender assumptions that match reality for White and Hispanic Americans but may misrepres
Artificial intelligence (AI) systems often reflect biases from economically advanced regions, marginalizing contexts in economically developing regions like Latin America due to imbalanced datasets. This paper examines AI representations of diverse Latin American contexts, revealing disparities between data from economically advanced and developing regions. We highlight how the dominance of English over Spanish, Portuguese, and indigenous languages such as Quechua and Nahuatl perpetuates biases, framing Latin American perspectives through a Western lens. To address this, we introduce a culturally aware dataset rooted in Latin American history and socio-political contexts, challenging Eurocentric models. We evaluate six language models on questions testing cultural context awareness, using a novel Cultural Expressiveness metric, statistical tests, and linguistic analyses. Our findings show that some models better capture Latin American perspectives, while others exhibit significant sentiment misalignment (p < 0.001). Fine-tuning Mistral-7B with our dataset improves its cultural expressiveness by 42.9%, advancing equitable AI development. We advocate for equitable AI by prioritizi
We study the obstacle problem associated with the American chooser option. The obstacle is given by the maximum of an American call option and an American put option, which, in turn, can be expressed as the maximum of the solutions to the corresponding obstacle problems. This structure makes the obstacle problem particularly challenging and non-trivial. Using theoretical analysis, we overcome these difficulties and establish the existence and uniqueness of a strong solution. Furthermore, we rigorously prove the monotonicity and smoothness of the free boundary arising from the obstacle problem.
Accurate valuation of American call options is critical in most financial decision-making environments. However, traditional models such as the Barone-Adesi Whaley (B-AW) and Binomial Option Pricing (BOP) methods fall short in handling the complexities of early exercise and market dynamics present in American options. This paper proposes a Modular Neural Network (MNN) model that aims to capture the key aspects of American option pricing. By dividing the prediction process into specialized modules, the MNN effectively models the non-linear interactions that drive American call option pricing. Experimental results indicate that the MNN model outperforms both traditional models as well as a simpler Feed-forward Neural Network (FNN) across multiple stocks (AAPL, NVDA, QQQ), with significantly lower RMSE and nRMSE (by mean). These findings highlight the potential of MNNs as a powerful tool to improve the accuracy of predicting option prices.
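For readers unfamiliar with the binomial baseline mentioned above, the following is a minimal sketch of a Cox-Ross-Rubinstein (CRR) binomial tree with an early-exercise check at every node. It is a textbook construction, not the paper's BOP implementation; the function name `crr_american`, its parameters, and the example inputs are assumptions chosen for illustration.

```python
import math


def crr_american(S, K, r, sigma, T, steps, kind="call", q=0.0):
    """Price an American option on a CRR binomial tree.

    S: spot, K: strike, r: risk-free rate, sigma: volatility, T: maturity in years,
    q: continuous dividend yield (assumed parameter), kind: "call" or "put".
    """
    if kind not in ("call", "put"):
        raise ValueError("kind must be 'call' or 'put'")
    sign = 1.0 if kind == "call" else -1.0
    dt = T / steps
    u = math.exp(sigma * math.sqrt(dt))            # up factor; down factor is 1 / u
    disc = math.exp(-r * dt)                       # one-step discount factor
    p = (math.exp((r - q) * dt) - 1.0 / u) / (u - 1.0 / u)  # risk-neutral up probability

    # Payoffs at maturity; node j has j up-moves and (steps - j) down-moves.
    values = [max(sign * (S * u ** (2 * j - steps) - K), 0.0)
              for j in range(steps + 1)]

    # Backward induction: hold (discounted expectation) versus exercise now.
    for i in range(steps - 1, -1, -1):
        for j in range(i + 1):
            hold = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            exercise = sign * (S * u ** (2 * j - i) - K)
            values[j] = max(hold, exercise)
    return values[0]


if __name__ == "__main__":
    print(crr_american(100, 100, 0.05, 0.2, 1.0, 500, kind="call"))
    print(crr_american(100, 100, 0.05, 0.2, 1.0, 500, kind="put"))
```

With `q=0.0` the early-exercise check never binds for a call, so the call price approaches the European Black-Scholes value as `steps` grows; the put, by contrast, carries a positive early-exercise premium.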
Representations of AI agents in user interfaces and robotics are predominantly White, not only in terms of facial and skin features, but also in the synthetic voices they use. In this paper we explore some unexpected challenges in the representation of race that we found in the process of developing a U.S. English Text-to-Speech (TTS) system intended to sound like an educated, professional African American woman without a regional accent. The paper starts by presenting the results of focus groups with African American IT professionals, in which guidelines and challenges for the creation of a representative and appropriate TTS system were discussed and gathered, followed by a discussion of some of the technical difficulties faced by the TTS system developers. We then describe two studies with U.S. English speakers in which the participants were not able to attribute the correct race to the African American TTS voice while overwhelmingly correctly recognizing the race of a White TTS system of similar quality. A focus group with African American IT workers not only confirmed the representativeness of the African American voice we built, but also suggested that the surprising recognition results may
We consider the problem of pricing American exchange options driven by a Lévy process. We first study the properties of the American exchange option and represent its price as the sum of the price of the corresponding European exchange option and an early exercise premium. We then show some properties of the free boundary and give an approximate formula for the price of an American exchange option.
We conducted a global comparative analysis of the coverage of American topics in different language versions of Wikipedia, using over 90 million Wikidata items and 40 million Wikipedia articles in 58 languages. Our study aimed to investigate whether Americanization is more or less dominant in different regions and cultures and to determine whether interest in American topics is universal.