Gerrymandering is one of the biggest threats to American democracy. By manipulating district lines, politicians effectively choose their voters rather than the other way around. Current gerrymandering identification methods (namely the Polsby-Popper and Reock scores) focus on the compactness of congressional districts, making them extremely sensitive to physical geography. To address this gap, we extend Feng and Porter's 2021 paper, which used the level-set method to turn geographic shapefiles into filtered simplicial complexes, in order to compare precinct-level voting data to district-level voting data. As precincts are regarded as too small to be gerrymandered, we are able to identify discrepancies between precinct- and district-level voting data to quantify gerrymandering in the United States. By comparing the persistent homologies of Democratic voting regions at the precinct and district levels, we detect when areas have been "cracked" (split across multiple districts) or "packed" (compressed into one district) for partisan gain. This analysis was conducted for North Carolina House of Representatives elections (2012-2024). North Carolina has been redistricted four times in the
This paper develops pavement performance evaluation models using data from primary and interstate highway systems in the state of South Carolina, USA. Twenty pavement sections were selected from across the state, and historical pavement performance data for those sections were collected. A total of 8 models were developed based on regression techniques: 4 for Asphalt Concrete (AC) pavements and 4 for Jointed Plain Concrete Pavements (JPCP). Four different performance indicators are considered as response variables in the statistical analysis: Present Serviceability Index (PSI), Pavement Distress Index (PDI), Pavement Quality Index (PQI), and International Roughness Index (IRI). Annual Average Daily Traffic (AADT), Free Flow Speed (FFS), precipitation, temperature, and soil type (soil Type A from Blue Ridge and Piedmont Region, and soil Type B from Coastal Plain and Sediment Region) are considered as predictor variables. Results showed that AADT, FFS, and precipitation have statistically significant effects on PSI and IRI for both JPCP and AC pavements. Temperature showed a significant effect only on PDI and PQI (p < 0.01) for AC pavements. Considering soil type, Type B
This paper presents the first publicly available version of the Carolina Corpus and discusses its future directions. Carolina is a large open corpus of Brazilian Portuguese texts under construction using web-as-corpus methodology enhanced with provenance, typology, versioning, and text integrality. The corpus aims to serve both as a reliable source for research in Linguistics and as an important resource for Computer Science research on language models, contributing towards removing Portuguese from the set of low-resource languages. Here we present the corpus construction methodology, comparing it with other existing methodologies, as well as the corpus's current state: Carolina's first public version has $653,322,577$ tokens, distributed over $7$ broad types. Each text is annotated with several different metadata categories in its header, which we developed using TEI annotation standards. We also present ongoing derivative works and invite NLP researchers to contribute their own.
To help facilitate a variety of simulations related to healthcare facilities in North Carolina, we have developed an agent-based model (ABM) to accurately simulate patient (i.e., agent) movement to and from these facilities. This is an Overview, Design Concepts, and Details (ODD) Protocol, a standardized method for describing ABMs. This ODD provides detailed information on healthcare facilities in North Carolina, the agent movement to and between them, and any decisions that were made during the creation of this model. This ABM is intended to be used alongside disease-specific submodels. It can be used for purposes such as simulating the success of interventions on reducing disease transmission, simulating strain on facility resources (including staff and materials), and forecasting hospital capacity. Disease-specific ODDs should accompany this document. No details related to any submodels that use this ABM as a base model are included.
Background: To curb the opioid epidemic, legislation regulating the amount of opioid prescriptions covered by Medicaid (Title XIX of the Social Security Act Medical Assistance Program) came into effect in May 2018 in South Carolina. Methods: We employ a classification system based on distance and disparity between dispensers, prescribers, and patients, and conduct an ARIMA analysis on each class as well as on the unclassified data to examine the effect of the legislation on opioid prescriptions, considering secular trends and autocorrelation. The study also compares trends in benzodiazepine prescriptions as a control. Results: The proposed classification system clusters each transaction into 16 groups based on the location of the stakeholders. These categories were found to have different prescription volume levels, with the highest group averaging 96.50 in daily MME (95% CI [63.43, 99.57]) and the lowest 37.78 (95% CI [37.38,38.18]). The ARIMA models show a decrease in overall prescription volume from 53.68 (95% CI [53.33,54.02]) to 51.09 (95% CI [50.74,51.44]) and varying impact across the different classes. Conclusion: The policy was effective in reducing opioid prescription volume overall. Howeve
On June 24th, Governor Cooper announced that North Carolina will not be moving into Phase 3 of its reopening process at least until July 17th. Given the recent increases in daily positive cases and hospitalizations, this decision was not surprising. However, given the political and economic pressures which are forcing the state to reopen, it is not clear what actions will help North Carolina to avoid the worst. We use a compartmentalized model to study the effects of social distancing measures and testing capacity combined with contact tracing on the evolution of the pandemic in North Carolina until the end of the year. We find that going back to restrictions that were in place during Phase 1 will slow down the spread but if the state wants to continue to reopen or at least remain in Phase 2 or Phase 3 it needs to significantly expand its testing and contact tracing capacity. Even under our best-case scenario of high contact tracing effectiveness, the number of contact tracers the state currently employs is inadequate.
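As a rough illustration of the kind of compartmentalized model described above, the following Python sketch integrates a simple S-I-Q-R system in which infected individuals are removed into isolation at a rate `tau`. The compartment structure, the parameter names (`beta`, `gamma`, `tau`), and the numerical values are assumptions chosen for illustration; they are not taken from the study, whose actual model is likely more detailed.

```python
def simulate(beta, gamma, tau, s0, i0, days, dt=0.1):
    """Forward-Euler integration of a toy S-I-Q-R model (population fractions).

    beta  : transmission rate (hypothetical value supplied by the caller)
    gamma : recovery rate
    tau   : rate at which infected people are found by testing/tracing
            and moved into isolation (Q), where they no longer transmit
    """
    s, i, q, r = s0, i0, 0.0, 0.0
    peak_i = i
    for _ in range(int(round(days / dt))):
        new_infections = beta * s * i * dt
        newly_isolated = tau * i * dt
        recovered_from_i = gamma * i * dt
        recovered_from_q = gamma * q * dt
        s -= new_infections
        i += new_infections - newly_isolated - recovered_from_i
        q += newly_isolated - recovered_from_q
        r += recovered_from_i + recovered_from_q
        peak_i = max(peak_i, i)
    return {"S": s, "I": i, "Q": q, "R": r, "peak_I": peak_i}


if __name__ == "__main__":
    # Illustrative parameters only: same epidemic with and without tracing.
    no_tracing = simulate(beta=0.3, gamma=0.1, tau=0.0, s0=0.99, i0=0.01, days=200)
    with_tracing = simulate(beta=0.3, gamma=0.1, tau=0.1, s0=0.99, i0=0.01, days=200)
    print("peak infected fraction without tracing:", no_tracing["peak_I"])
    print("peak infected fraction with tracing:   ", with_tracing["peak_I"])
```

In this toy setting, raising `tau` plays the role of expanded testing and contact tracing: it shortens the time an infected person spends transmitting, which lowers the peak of the infected compartment.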
The lottery is a very lucrative industry. Popular fascination often focuses on the largest prizes. However, less attention has been paid to detecting unusual lottery buying behaviors at lower stakes. Our paper introduces a new model to detect illegal discounting in the North Carolina Education Lottery using statistical analysis of net gains and ticket buying habits. Nine outlying players are flagged and are further examined using a proposed stochastic model to calculate the range of their possible losses in the lottery. The unusual buying patterns of the players flagged as outliers are further confirmed using a K-means clustering analysis of lottery store visiting behaviors.
Voter suppression and associated racial disparities in access to voting are long-standing civil rights concerns in the United States. Barriers to voting have taken many forms over the decades. A history of violent explicit discouragement has shifted to more subtle access limitations that can include long lines and wait times, long travel times to reach a polling station, and other logistical barriers to voting. Our focus in this work is on quantifying disparities in voting access pertaining to the overall time-to-vote, and how they could be remedied via a better choice of polling location or provisioning more sites where voters can cast ballots. However, appropriately calibrating access disparities is difficult because of the need to account for factors such as population density and different community expectations for reasonable travel times. In this paper, we quantify access to polling locations, developing a methodology for the calibrated measurement of racial disparities in polling location "load" and distance to polling locations. We apply this methodology to a study of real-world data from Florida and North Carolina to identify disparities in voting access from the 2020 elec
North Carolina's constitution requires that state legislative districts should not split counties. However, counties must be split to comply with the "one person, one vote" mandate of the U.S. Supreme Court. Given that counties must be split, the North Carolina legislature and courts have provided guidelines that seek to reduce counties split across districts while also complying with the "one person, one vote" criteria. Under these guidelines, the counties are separated into clusters. The primary goal of this work is to develop, present, and publicly release an algorithm to optimally cluster counties according to the guidelines set by the court in 2015. We use this tool to investigate the optimality and uniqueness of the enacted clusters under the 2017 redistricting process. We verify that the enacted clusters are optimal, but find other optimal choices. We emphasize that the tool we provide lists \textit{all} possible optimal county clusterings. We also explore the stability of clustering under changing statewide populations and project what the county clusters may look like in the next redistricting cycle beginning in 2020/2021.
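The population test that underlies any county clustering can be sketched in a few lines of Python. The sketch below checks whether a group of counties can hold a whole number of districts whose populations all stay within a tolerance of the ideal district size. The function names, the example populations, and the default tolerance of 5% are assumptions made for illustration; this is not the released algorithm, which additionally has to search over contiguous county groupings and optimize among them.

```python
import math


def feasible_district_counts(cluster_pop, ideal_pop, tol=0.05):
    """Return every district count k such that a cluster of population
    cluster_pop can be divided into k districts, each within +/- tol of
    ideal_pop (that is, k*ideal*(1-tol) <= cluster_pop <= k*ideal*(1+tol))."""
    lowest = math.ceil(cluster_pop / (ideal_pop * (1 + tol)))
    highest = math.floor(cluster_pop / (ideal_pop * (1 - tol)))
    return list(range(max(lowest, 1), highest + 1))


def cluster_is_valid(counties, populations, ideal_pop, tol=0.05):
    """A cluster passes the population test if at least one whole number
    of districts fits inside it. Contiguity is NOT checked here."""
    total = sum(populations[name] for name in counties)
    return len(feasible_district_counts(total, ideal_pop, tol)) > 0


if __name__ == "__main__":
    # Hypothetical county populations and ideal district size.
    populations = {"A": 60, "B": 40, "C": 205}
    ideal = 100
    print(cluster_is_valid(["A"], populations, ideal))       # too small alone
    print(cluster_is_valid(["A", "B"], populations, ideal))  # fits one district
    print(feasible_district_counts(populations["C"], ideal)) # fits two districts
```

A county that fails the test on its own must be grouped with neighbours until the combined population passes, which is what forces counties into multi-county clusters.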
During hurricane seasons, emergency managers and other decision makers need accurate and `on-time' information on potential storm surge impacts. Fully dynamical computer models, such as the ADCIRC tide, storm surge, and wind-wave model, take several hours to complete a forecast when configured at high spatial resolution. Additionally, statistically meaningful ensembles of high-resolution models (needed for uncertainty estimation) cannot easily be computed in near real-time. This paper discusses an artificial neural network model for storm surge prediction in North Carolina. The network model provides fast, real-time storm surge estimates at coastal locations in North Carolina. The paper studies the performance of the neural network model vs. other models on synthetic and real hurricane data.
We evaluate relative sea level (RSL) trajectories for North Carolina, USA, in the context of tide-gauge measurements and geological sea-level reconstructions spanning the last $\mathord{\sim}$11,000 years. RSL rise was fastest ($\mathord{\sim}$7 mm/yr) during the early Holocene and slowed over time with the end of the deglaciation. During the pre-Industrial Common Era (i.e., 0--1800 CE), RSL rise ($\mathord{\sim}$0.7 to 1.1 mm/yr) was driven primarily by glacio-isostatic adjustment, though dampened by tectonic uplift along the Cape Fear Arch. Ocean/atmosphere dynamics caused centennial variability of up to $\mathord{\sim}$0.6 mm/yr around the long-term rate. It is extremely likely (probability $P = 0.95$) that 20th century RSL rise at Sand Point, NC, (2.8 $\pm$ 0.5 mm/yr) was faster than during any other century in at least 2,900 years. Projections based on a fusion of process models, statistical models, expert elicitation, and expert assessment indicate that RSL at Wilmington, NC, is very likely ($P = 0.90$) to rise by 42--132 cm between 2000 and 2100 under the high-emissions RCP 8.5 pathway. Under all emission pathways, 21st century RSL rise is very likely ($P > 0.90$) to be f
Annual Average Daily Traffic (AADT) is an important parameter used in traffic engineering analysis. Departments of Transportation (DOTs) continually collect traffic counts using both permanent count stations (i.e., Automatic Traffic Recorders or ATRs) and temporary short-term count stations. In South Carolina, 87% of the ATRs are located on interstates and arterial highways. For most secondary highways (i.e., collectors and local roads), AADT is estimated based on short-term counts. This paper develops AADT estimation models for different roadway functional classes with two machine learning techniques: Artificial Neural Network (ANN) and Support Vector Regression (SVR). The models aim to predict AADT from short-term counts. The results are first compared against each other to identify the best model. Then, the results of the best model are compared against a regression method and a factor-based method. The comparison reveals the superiority of SVR for AADT estimation for different roadway functional classes over all other methods. Among all developed models for different functional roadway classes, the SVR-based model shows a minimum root mean square error (RMSE) of 0.22 and a mean ab
The coronavirus disease (COVID-19) pandemic has changed various aspects of people's lives and behaviors. At this stage, there are no other ways to control the natural progression of the disease than adopting mitigation strategies such as wearing masks, watching distance, and washing hands. Moreover, at this time of social distancing, social media plays a key role in connecting people and providing a platform for expressing their feelings. In this study, we tap into social media to surveil the uptake of mitigation and detection strategies, and capture issues and concerns about the pandemic. In particular, we explore the research question, "how much can be learned regarding the public uptake of mitigation strategies and concerns about the COVID-19 pandemic by using natural language processing on Reddit posts?" After extracting COVID-related posts from the four largest subreddit communities of North Carolina over six months, we performed NLP-based preprocessing to clean the noisy data. We employed a custom Named-entity Recognition (NER) system and a Latent Dirichlet Allocation (LDA) method for topic modeling on a Reddit corpus. We observed that 'mask', 'flu', and 'testing' are the most preval
The Covid-19 pandemic has spread across the world since the beginning of 2020. Many regions have experienced its effects. The state of South Carolina in the USA has seen cases since early March 2020 and a primary peak in early April 2020. A lockdown was imposed on April 6th, but lifting of restrictions started on April 24th. The daily case and death data as reported by NCHS (deaths) via the New York Times GitHub repository have been analyzed, and approaches to modeling the data are presented. Prediction is also considered, and the role of asymptomatic transmission is assessed as a latent unobserved effect. Two different time periods are examined and one-step prediction is provided.
Rigorous model-based analysis can help inform state-level energy and climate policy. In this study, we utilize an open-source energy system optimization model and publicly available datasets to examine future electricity generation, CO2 emissions, and CO2 abatement costs for the North Carolina electric power sector through 2050. Model scenarios include uncertainty in future fuel prices, a hypothetical CO2 cap, and an extended renewable portfolio standard. Across the modeled scenarios, solar photovoltaics represent the most cost-effective low-carbon technology, while trade-offs among carbon-constrained scenarios largely involve natural gas and renewables. We also develop a new method to calculate break-even costs, which indicate the capital costs at which different technologies become cost-effective within the model. Significant variation in break-even costs is observed across different technologies and scenarios. We illustrate how break-even costs can be used to inform the development of an extended renewable portfolio standard in North Carolina. Utilizing the break-even costs to calibrate a tax credit for onshore wind, we find that the resultant wind deployment displaces other re
Flooding due to Hurricane Florence led to billions of dollars in damage and nearly a hundred deaths in North Carolina. These damages and fatalities can be avoided with proper prevention and preparation. Modelling such flooding events can provide insight and inform precautions based on principles of fluid dynamics and GIS technology. Using topography and other geographic data from USGS, HEC-RAS can solve the Shallow Water Equations over flooding areas to assist the study of inundation patterns. Simulation results from HEC-RAS agree with observations from NOAA in the flooding area studied. Modeled results from specific locations affected by Hurricane Florence near the Neuse River, NC, are compared with observations. While the overall pattern of inundation agrees between model results and observations, there are also differences at very specific locations. Higher-resolution topography and precipitation data over the flooding area may improve the simulation results and reduce the differences.
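For reference, a common textbook statement of the two-dimensional Shallow Water Equations is given below, written in conservative form with water depth $h$, depth-averaged velocities $u$ and $v$, gravitational acceleration $g$, and bed elevation $z_b$. This is a simplified form shown under the assumption of no bottom friction, eddy viscosity, Coriolis force, or rainfall source terms; the form actually discretized by HEC-RAS includes additional terms and may differ in detail.

```latex
\begin{aligned}
\frac{\partial h}{\partial t} + \frac{\partial (hu)}{\partial x} + \frac{\partial (hv)}{\partial y} &= 0, \\
\frac{\partial (hu)}{\partial t} + \frac{\partial}{\partial x}\left(hu^2 + \tfrac{1}{2} g h^2\right) + \frac{\partial (huv)}{\partial y} &= -g h \frac{\partial z_b}{\partial x}, \\
\frac{\partial (hv)}{\partial t} + \frac{\partial (huv)}{\partial x} + \frac{\partial}{\partial y}\left(hv^2 + \tfrac{1}{2} g h^2\right) &= -g h \frac{\partial z_b}{\partial y}.
\end{aligned}
```

The first equation expresses conservation of mass and the other two conservation of momentum in $x$ and $y$. The bed-slope terms on the right-hand side are where topography data enter, which is consistent with the observation that higher-resolution topography may improve the simulated inundation pattern.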
The traditional formulation of string amplitudes via worldsheet integrals provides a parametrization of the moduli space that fails to expose the complete singularity structure of the amplitudes. This problem is solved by the positive parametrization of string amplitudes given by surfaceology. In this work, we use this formalism to study a number of properties of string amplitudes at tree-level and one-loop. We introduce several global prescriptions for an integration contour for which the integrals are finite everywhere in kinematic space. At tree-level, this is done in two ways: one directly implements the Feynman $i\varepsilon$ to analytically continue from Euclidean to Lorentzian worldsheets; the other is a generalization of the closed Pochhammer contour to arbitrary number of points. At loop-level, we present a systematic way of extracting cuts directly from the worldsheet integrand. This provides a powerful set of unitarity constraints, which we use to test the consistency of different "stringy" UV regularizations of field theory amplitudes. In addition, we identify the massive threshold expansion of the integrand, which allows us to reduce the problem to a finite set of Feyn
In this paper, we discuss the stable discretisation of the double layer boundary integral operator for the wave equation in $1d$. For this, we show that the boundary integral formulation is $L^2$-elliptic and also inf-sup stable in standard energy spaces. This turns out to be a particular case of a recent result on the inf-sup stability of boundary integral operators for the wave equation and contributes to its further understanding. Moreover, we present the first BEM discretisations of second-kind operators for the wave equation for which stability is guaranteed and a complete numerical analysis is offered. We validate our theoretical findings with numerical experiments.
We study the algebraic monodromy of families of cyclic Galois coverings of curves. Under a condition on the $G$-decomposition of the associated variation of Hodge structures, we prove a criterion for the maximality of the monodromy. The proof combines the genus-zero case with a degeneration argument involving Prym varieties of certain admissible coverings. As a consequence of our criterion, we show that for $g\geq 8$ there exists no special family of Galois covers of the type we consider, providing new evidence towards the Coleman-Oort conjecture. Finally, we determine when the loci of double and triple Galois covers are totally geodesic.
A surprising connection has recently been made between the amplitudes for Tr($Φ^3$) theory and the non-linear sigma model (NLSM). A simple shift of kinematic variables naturally suggested by the associahedron/stringy representation of Tr($Φ^3$) theory yields pion amplitudes at all loops. In this note we provide an elementary motivation and proof for this link going in the opposite direction, starting from the non-linear sigma model and discovering its formulation as a sum over triangulations of surfaces with simple numerator factors. This uses an ancient connection between "circles" and "triangles", interpreting the equation $y = \sqrt{1 - x^2}$ both as parametrizing points on a circle as well as generating the number of triangulations of polygons. A further simplification of the numerator factors exposes them as arising from the kinematically shifted Tr($Φ^3$) theory, and gives rise to novel tropical representations of NLSM amplitudes. The connection to Tr($Φ^3$) theory defines a natural notion of "surface-soft limit" intrinsic to curves on surfaces. Remarkably, with this definition, the soft limit of pion amplitudes vanishes directly at the level of the integrand, via obvious pai