This study presents a mathematical optimization framework and analysis to inform practical long-term investment planning in Puerto Rico's electric power system. We use a high-resolution capacity expansion planning model to identify least-cost generation and storage investments that improve reliability. The model co-optimizes new investments with thermal retirements and includes detailed dispatch, unit commitment, fuel selection, storage operation, engineering limits, system constraints, fuel supply limits, and load balance. Key advances over prior studies on Puerto Rico's system include: (i) nodal transmission representation at 38 kV and above; (ii) hourly chronological simulation for representative days; (iii) explicit unit commitment for existing and new thermal units with realistic ramping, minimum up and down times, and startup costs; (iv) system-wide fuel supply constraints; and (v) operational scenarios reflecting load variability, renewable availability, and high forced outage rates in legacy units. Using data from LUMA, the Puerto Rico Electric Power Authority (PREPA), U.S. Department of Energy, and public sources, the study builds representative Puerto Rico systems for
Citation network analysis has become one of the methods used to study how scientific knowledge flows from one domain to another. Health informatics is a multidisciplinary field that includes social science, software engineering, behavioral science, medical science and others. In this study, we perform an analysis of citation statistics from health informatics journals using a data set extracted from CrossRef. For each health informatics journal, we extract the number of citations from/to studies related to computer science, medicine/clinical medicine and other fields, including the number of self-citations from the health informatics journal. With a similar number of articles used in our analysis, we show that the Journal of the American Medical Informatics Association (JAMIA) has more in-citations than the Journal of Medical Internet Research (JMIR), while JMIR has a higher number of out-citations and self-citations. We also show that JMIR cites more articles from health informatics journals and medicine-related journals. In addition, the Journal of Medical Systems (JMS) cites more articles from computer science journals compared with other health informatics journals included in our analysi
Past research has shown the benefits of food journaling in promoting mindful eating and healthier food choices. However, the links between journaling and healthy eating have not been thoroughly examined. Beyond caloric restriction, do journalers consistently and sufficiently consume healthful diets? How different are their eating habits compared to those of average consumers who tend to be less conscious about health? In this study, we analyze the healthy eating behaviors of active food journalers using data from MyFitnessPal. Surprisingly, our findings show that food journalers do not eat as healthily as they should despite their proclivity for healthy eating, and their food choices resemble those of the general populace. Furthermore, we find that journaling duration is only a marginal determinant of healthy eating outcomes, whereas sociodemographic factors, such as gender and region of residence, are much more predictive of healthy food choices.
There is limited understanding of how dietary behaviors cluster together and influence cardiometabolic health at a population level in Puerto Rico. Data availability is scarce, particularly outside of urban areas, and is often limited to non-probability sample (NPS) data where sample inclusion mechanisms are unknown. In order to generalize results to the broader Puerto Rican population, adjustments are necessary to account for selection bias but are difficult to implement for NPS data. Although Bayesian latent class models enable summaries of dietary behavior variables through underlying patterns, they have not yet been adapted to the NPS setting. We propose a novel Weighted Overfitted Latent Class Analysis for Non-probability samples (WOLCAN). WOLCAN utilizes a quasi-randomization framework to (1) model pseudo-weights for an NPS using Bayesian additive regression trees (BART) and a reference probability sample, and (2) integrate the pseudo-weights within a weighted pseudo-likelihood approach for Bayesian latent class analysis, while propagating pseudo-weight uncertainty into parameter estimation. A stacked sample approach is used to allow shared individuals between the NPS and the
This study examines the social media uptake of scientific journals on two different platforms, X and WeChat, by comparing the adoption of X among journals indexed in the Science Citation Index-Expanded (SCIE) with the adoption of WeChat among journals indexed in the Chinese Science Citation Database (CSCD). The findings reveal substantial differences in platform adoption and user engagement, shaped by local contexts. While only 22.7% of SCIE journals maintain an X account, 84.4% of CSCD journals have a WeChat official account. Journals in Life Sciences & Biomedicine lead in uptake on both platforms, whereas those in Technology and Physical Sciences show high WeChat uptake but comparatively lower presence on X. User engagement on both platforms is dominated by low-effort interactions rather than more conversational behaviors. Correlation analyses indicate weak-to-moderate relationships between bibliometric indicators and social media metrics, confirming that online engagement reflects a distinct dimension of journal impact, whether on an international or a local platform. These findings underscore the need for broader social media metric frameworks that incorporate locally dom
I study how first sizable industry entries reshape local and neighboring labor markets in Puerto Rico. Using over a decade of quarterly municipality--industry data (2014Q1--2025Q1), I identify ``first sizable entries'' as large, persistent jumps in establishments, covered employment, and wage bill, and treat these as shocks to local industry presence at the municipio--industry level. Methodologically, I combine staggered-adoption difference-in-differences estimators that are robust to heterogeneous treatment timing with an imputation-based event-study approach, and I use a doubly robust difference-in-differences framework that explicitly allows for interference through pre-specified exposure mappings on a contiguity graph. The estimates show large and persistent direct gains in covered employment and wage bill in the treated municipality--industry cells over 0--16 quarters. Same-industry neighbors experience sizable short-run gains that reverse over the medium run, while within-municipality cross-industry and neighbor all-industries spillovers are small and imprecisely estimated. Once these spillovers are taken into account and spatially robust inference and sensitivity checks are
The uptake of machine learning (ML) approaches in the social and health sciences has been rather slow, and research using ML for social and health research questions remains fragmented. This may be due to the separate development of research in the computational/data versus social and health sciences as well as a lack of accessible overviews and adequate training in ML techniques for non-data-science researchers. This paper provides a meta-mapping of research questions in the social and health sciences to appropriate ML approaches, incorporating the requirements of statistical analysis in these disciplines. We map the established classification into description, prediction, and causal inference to common research goals, such as estimating the prevalence of adverse health or social outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes. This meta-mapping aims at overcoming disciplinary barriers and starting a fluid dialogue between researchers from the social and health sciences and methodologically trained researchers. Such mapping may also help to fully exploit the benefits of ML while considering domain-specific aspects rele
Age-Period-Cohort (APC) models are of special importance in Demography and Epidemiology for analyzing panel data according to three different factors: biological (age), technological (period) and cultural (cohort). The main goal of APC modeling is to separate the contributions of period and cohort effects to the phenomenon under study. The objective of this paper is to develop a Bayesian Age-Period-Cohort framework that can model a wide range of demographic and epidemiological phenomena and improve upon existing statistical methodologies. The APC framework addresses three main challenges: (1) the identification problem of all APC models, usually managed by imposing constraints on effect groups, (2) considering expert knowledge in the model definition, and (3) efficient solution of computational issues. By allowing full parameter uncertainty, use of robust priors, and an efficient computational implementation, a Bayesian methodology manages these concerns. Bayesian models also produce results that allow intuitive implementation and support theoretical knowledge. Our original methodology consists of the use of (i) a Scaled Beta2 prior distribution for the scale parameters, (ii) i
Mobile health has the potential to revolutionize health care delivery and patient engagement. In this work, we discuss how integrating Artificial Intelligence into digital health applications (focused on supply chain, patient management, and capacity building, among other use cases) can improve the health system and public health performance. We present an Artificial Intelligence and Reinforcement Learning platform that allows the delivery of adaptive interventions whose impact can be optimized through experimentation and real-time monitoring. The system can integrate multiple data sources and digital health applications. The flexibility of this platform to connect to various mobile health applications and digital devices and send personalized recommendations based on past data and predictions can significantly improve the impact of digital tools on health system outcomes. The potential for resource-poor settings, where the impact of this approach on health outcomes could be more decisive, is discussed specifically. This framework is, however, similarly applicable to improving efficiency in health systems where scarcity is not an issue.
This paper explores a unique piece of cave art found in southern Puerto Rico that depicts a comet over a tomb. Through interdisciplinary methods, including art interpretation, historical documentation, and demographic analysis, this study uncovered the artist's identity, the societal context of the period, and the potential motivations behind the creation of this art. The investigation revealed a connection to the passage of Halley's Comet in 1910 and the widespread panic it induced.
The journal structure in the China Scientific and Technical Papers and Citations Database (CSTPCD) is analysed from three perspectives: the database level, the specialty level and the institutional level (i.e., university journals versus journals issued by the Chinese Academy of Sciences). The results are compared with those for (Chinese) journals included in the Science Citation Index. The frequency of journal-journal citation relations in the CSTPCD is an order of magnitude lower than in the SCI. Chinese journals, especially high-quality journals, prefer to cite international journals rather than domestic ones. However, Chinese journals do not get an equivalent reception from their international counterparts. The international visibility of Chinese journals is low, but varies among fields of science. Journals of the Chinese Academy of Sciences (CAS) have a better reception in the international scientific community than university journals.
We compare the network of aggregated journal-journal citation relations provided by the Journal Citation Reports (JCR) 2012 of the Science and Social Science Citation Indexes (SCI and SSCI) with similar data based on Scopus 2012. First, global maps were developed for the two sets separately; sets of documents can then be compared using overlays to both maps. Using fuzzy-string matching and ISSN numbers, we were able to match 10,524 journal names between the two sets; that is, 96.4% of the 10,936 journals contained in JCR or 51.2% of the 20,554 journals covered by Scopus. Network analysis was then pursued on the set of journals shared between the two databases and the two sets of unique journals. Citations among the shared journals are more comprehensively covered in JCR than Scopus, so the network in JCR is denser and more connected than in Scopus. The ranking of shared journals in terms of indegree (that is, numbers of citing journals) or total citations is similar in both databases overall (Spearman's $\rho$ > 0.97), but some individual journals rank very differently. Journals that are unique to Scopus seem to be less important--they are citing shared journals rather than bein
The traditional time series methodology requires at least a preliminary transformation of the data to achieve stationarity. In contrast, Robust Bayesian Dynamic Models (RBDMs) do not assume a regular pattern or stability of the underlying system and can accommodate structural breaks. In this paper we use RBDMs to account for possible outliers and structural breaks in Latin American economic time series. We work with important economic time series from Puerto Rico and Mexico. Using a random walk model, we show how RBDMs can be applied to detect historic changes in inflation in Mexico. We also model the Consumer Price Index (CPI), the Economic Activity Index (EAI) and the total number of employments (TNE) time series in Puerto Rico using local linear trend and seasonal RBDMs with observational and state variances. The results illustrate how the model accounts for the structural breaks during the historic recession periods in Puerto Rico.
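As a rough illustration of the random walk model mentioned above, the sketch below runs a plain Gaussian Kalman filter on a local level model (observation equals a random-walk level plus noise) and reports standardized one-step forecast errors, which spike at a level shift. This is not the robust formulation of the abstract: the fixed variances `obs_var` and `state_var`, the diffuse prior `m0`, `c0`, and the function name `local_level_filter` are all assumptions made for illustration.

```python
import numpy as np

def local_level_filter(y, obs_var=1.0, state_var=0.1, m0=0.0, c0=1e6):
    """Kalman filter for y_t = mu_t + v_t, mu_t = mu_{t-1} + w_t.

    Returns the filtered level means and the standardized one-step
    forecast errors. Variances are fixed here (an assumption); a robust
    Bayesian dynamic model would place priors on them instead.
    """
    m, c = m0, c0
    means, z_scores = [], []
    for obs in y:
        r = c + state_var          # prior variance of the level at time t
        q = r + obs_var            # one-step forecast variance
        e = obs - m                # one-step forecast error
        k = r / q                  # Kalman gain
        m = m + k * e              # posterior mean of the level
        c = r * (1.0 - k)          # posterior variance of the level
        means.append(m)
        z_scores.append(e / np.sqrt(q))
    return np.array(means), np.array(z_scores)

if __name__ == "__main__":
    # Synthetic series (not real data): the level jumps from 0 to 10 at t = 50.
    y = np.concatenate([np.zeros(50), np.full(50, 10.0)])
    means, z = local_level_filter(y)
    print("largest standardized error at t =", int(np.argmax(np.abs(z))))
```

Large standardized errors are only a heuristic flag for candidate outliers or breaks; heavier-tailed error distributions, as robust models use, would change how quickly the level adapts after such a point.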
Selecting the right monitoring level in Remote Patient Monitoring (RPM) systems for e-healthcare is crucial for balancing patient outcomes, various resources, and patients' quality of life. Prior work has used one-dimensional health representations, but patient health is inherently multidimensional and typically consists of many measurable physiological factors. In this paper, we introduce a multidimensional health state model within the RPM framework and use dynamic programming to study optimal monitoring strategies. Our analysis reveals that the optimal control is characterized by switching curves (for two-dimensional health states) or switching hyper-surfaces (in general): patients switch to intensive monitoring when health measurements cross a specific multidimensional surface. We further study how the optimal switching curve varies for different medical conditions and model parameters. This finding of the optimal control structure provides actionable insights for clinicians and aids in resource planning. The tunable modeling framework enhances the applicability and effectiveness of RPM services across various medical conditions.
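To make the dynamic programming idea concrete, here is a minimal value-iteration sketch on a toy two-dimensional health grid with two actions, basic and intensive monitoring. Everything in it is a hypothetical stand-in rather than the paper's model: the grid size, the transition probabilities in `moves`, the additive health cost, the discount factor, and the name `solve_monitoring` are assumptions chosen only to show how a policy map over a 2-D state can be computed.

```python
import numpy as np

def solve_monitoring(n=5, cost_intensive=3.0, gamma=0.9, tol=1e-9):
    """Value iteration on an n x n health grid; returns (V, policy).

    State (i, j): two health measurements, higher is healthier.
    policy[i, j] is 0 for basic and 1 for intensive monitoring.
    """
    # Assumed per-dimension move probabilities (down, stay, up) per action.
    moves = {0: (0.30, 0.60, 0.10), 1: (0.05, 0.45, 0.50)}

    def step_matrix(p_down, p_stay, p_up):
        # 1-D transition matrix with moves clipped at the grid edges.
        T = np.zeros((n, n))
        for s in range(n):
            T[s, max(s - 1, 0)] += p_down
            T[s, s] += p_stay
            T[s, min(s + 1, n - 1)] += p_up
        return T

    T = {a: step_matrix(*moves[a]) for a in moves}
    levels = np.arange(n)
    # Health cost grows as either dimension falls below the best level n - 1.
    health_cost = (n - 1 - levels)[:, None] + (n - 1 - levels)[None, :]
    action_cost = {0: 0.0, 1: cost_intensive}

    V = np.zeros((n, n))
    while True:
        # Dimensions evolve independently, so E[V(next)] = T V T^T.
        Q = np.stack([health_cost + action_cost[a] + gamma * T[a] @ V @ T[a].T
                      for a in (0, 1)])
        V_new = Q.min(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmin(axis=0)
        V = V_new

if __name__ == "__main__":
    V, policy = solve_monitoring()
    print(policy)  # 1 marks states where intensive monitoring is chosen
```

In this toy setting, the boundary between the 0 and 1 regions of `policy` plays the role of a switching curve; varying `cost_intensive` or the probabilities in `moves` shows how that boundary shifts with model parameters.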
The abrupt decline in the Total Fertility Rate (TFR) of Puerto Rico since 2000 makes the prospect of a sustained population decline a real possibility. From 2000 to 2021 the TFR declined from 2.1 to 0.9 children per woman, one of the lowest in the world. Population projections produced by the United States Census Bureau and the United Nations Population Division show that the island population may decline from 3.8 million in 2000 to slightly above 2 million by 2050, a dramatic 47% population decline in 50 years. As dire as this prospect may be, this may be an optimistic scenario. Both projections have the TFR increasing to 1.5 by 2050, but a fertility projection conducted by us shows that fertility can remain much closer to 1.0 until 2050. Bayesian Hierarchical Probabilistic Theory has been used by the United Nations to incorporate a way to measure the uncertainty and to estimate the projection parameters. However, the assumption that the fertility level in countries with low fertility will eventually increase to 2.1 has been widely criticized as unrealistic and not supported by evidence. We modified the assumptions used by the United Nations considering countries with TFR similar
This study examines the role of top-tier conference publications in Hungarian computer science research. We show that the national scientometric practice, which is currently journal-oriented, diverges from international norms, creating incentive distortions in researcher evaluation. By linking multiple databases (iCore, DBLP, MTMT, MTA-ATT), we mapped Hungarian-affiliated CORE A* and A conference papers, their temporal and thematic distribution, and author trajectories. Our results indicate that, in theoretical fields, publishing at international conferences became common earlier than in applied fields. At the same time, in applied fields, successful researchers are more likely to continue their careers in foreign institutions or in industry positions. Overall, a substantial share of the already established, internationally most successful researchers are now affiliated with institutions abroad. We recommend recognizing CORE A* papers as equivalent to D1 and CORE A papers as equivalent to Q1 journals in national evaluation systems.
Using three years of the Journal Citation Reports (2011, 2012, and 2013), indicators of transitions in 2012 (between 2011 and 2013) are studied using methodologies based on entropy statistics. Changes can be indicated at the level of journals using the margin totals of entropy production along the row or column vectors, but also at the level of links among journals by importing the transition matrices into network analysis and visualization programs (and using community-finding algorithms). Seventy-four journals are flagged in terms of discontinuous changes in their citations; but 3,114 journals are involved in "hot" links. Most of these links are embedded in a main component; 78 clusters (containing 172 journals) are flagged as potential "hot spots" emerging at the network level. An additional finding is that PLoS ONE introduced a new communication dynamics into the database. The limitations of the methodology are elaborated using an example. The results of the study indicate where developments in the citation dynamics can be considered as significantly unexpected. This can be used as heuristic information; but what a "hot spot" in terms of the entropy statistics of aggregated cit
The Oregon Health Insurance Experiment (OHIE) offers a unique opportunity to examine the causal relationship between Medicaid coverage and happiness among low-income adults, using an experimental design. This study leverages data from comprehensive surveys conducted at 0 and 12 months post-treatment. Previous studies based on OHIE have shown that individuals receiving Medicaid exhibited a significant improvement in mental health compared to those who did not receive coverage. The primary objective is to explore how Medicaid coverage impacts happiness, specifically analyzing in which direction variations in healthcare spending significantly improve mental health: higher spending or lower spending after Medicaid. Utilizing instrumental variable (IV) regression, I conducted six separate regressions across subgroups categorized by expenditure levels and happiness ratings, and the results reveal distinct patterns. Enrolling in OHP has significantly decreased the probability of experiencing unhappiness, regardless of whether individuals had high or low medical spending. Additionally, it decreased the probability of being pretty happy and having high medical expenses, while increasing the
This research paper presents a meta-analysis of the multifaceted role of technology in mental health. The pervasive influence of technology on daily lives necessitates a deep understanding of its impact on mental health services. This study synthesizes literature covering Behavioral Intervention Technologies (BITs), digital mental health interventions during COVID-19, young men's attitudes toward mental health technologies, technology-based interventions for university students, and the applicability of mobile health technologies for individuals with serious mental illnesses. BITs are recognized for their potential to provide evidence-based interventions for mental health conditions, especially anxiety disorders. The COVID-19 pandemic acted as a catalyst for the adoption of digital mental health services, underscoring their crucial role in providing accessible and quality care; however, their efficacy needs to be reinforced by workforce training, high-quality evidence, and digital equity. A nuanced understanding of young men's attitudes toward mental health is imperative for devising effective online services. Technology-based interventions for university students are promising, al
Image recaptioning is widely used to generate training datasets with enhanced quality for various multimodal tasks. Existing recaptioning methods typically rely on powerful multimodal large language models (MLLMs) to enhance textual descriptions, but often suffer from inaccuracies due to hallucinations and incompleteness caused by missing fine-grained details. To address these limitations, we propose RICO, a novel framework that refines captions through visual reconstruction. Specifically, we leverage a text-to-image model to reconstruct a caption into a reference image, and prompt an MLLM to identify discrepancies between the original and reconstructed images to refine the caption. This process is performed iteratively, progressively promoting the generation of more faithful and comprehensive descriptions. To mitigate the additional computational cost induced by the iterative process, we introduce RICO-Flash, which learns to generate captions like RICO using DPO. Extensive experiments demonstrate that our approach significantly improves caption accuracy and completeness, outperforming most baselines by approximately 10% on both CapsBench and CompreCap. Code released at https