Found 20 results
We model the problem of optimizing the schedule of courses a student at the American College of Greece will need to take to complete their studies. We model all constraints set forth by the institution and the department, so that we guarantee the validity of all produced schedules. We formulate several different objectives to optimize in the resulting schedule, including fastest completion time and course difficulty balance. A particularly important objective our model can capture is the maximization of the expected student GPA given their performance on passed courses, using Machine Learning and Data Mining techniques. All resulting problems are Mixed Integer Linear Programming problems with a number of binary variables that is on the order of the maximum number of terms times the number of courses available for the student to take. The resulting Mathematical Programming problem is always solvable by the GUROBI solver in less than 10 seconds on a modern commercial off-the-shelf PC, whereas the manual process that was in place before used to take department heads who are designated as student advisors more than one hour of their time for every student and was re
Data science has become increasingly essential for the production of official statistics, as it enables the automated collection, processing, and analysis of large amounts of data. Such data science practices enable more timely, more insightful and more flexible reporting. However, the quality and integrity of data-science-driven statistics rely on the accuracy and reliability of the data sources and the machine learning techniques that support them. In particular, changes in data sources are inevitable and pose significant risks that are crucial to address in the context of machine learning for official statistics. This paper gives an overview of the main risks, liabilities, and uncertainties associated with changing data sources in the context of machine learning for official statistics. We provide a checklist of the most prevalent origins and causes of changing data sources; not only on a technical level but also regarding ownership, ethics, regulation, and public perception. Next, we highlight the repercussions of changing data sources on statistical reporting. These include technical effects such as concept drift, bias, availability, validity, accur
There is debate over whether Asian American students are admitted to selective colleges and universities at lower rates than white students with similar academic qualifications. However, there have been few empirical investigations of this issue, in large part due to a dearth of data. Here we present the results from analyzing 685,709 applications from Asian American and white students to a subset of selective U.S. institutions over five application cycles, beginning with the 2015-2016 cycle. The dataset does not include admissions decisions, and so we construct a proxy based in part on enrollment choices. Based on this proxy, we estimate the odds that Asian American applicants were admitted to at least one of the schools we consider were 28% lower than the odds for white students with similar test scores, grade-point averages, and extracurricular activities. The gap was particularly pronounced for students of South Asian descent (49% lower odds). We trace this pattern in part to two factors. First, many selective colleges openly give preference to the children of alumni, and we find that white applicants were substantially more likely to have such legacy status than Asian applican
A previous study of symmetric collisions of massive nuclei has shown that current models of multi-nucleon transfer (MNT) reactions do not adequately describe the transfer product yields. To gain further insight into this problem, we have measured the yields of MNT products in the interaction of 977 (E/A = 4.79 MeV) and 1143 MeV (E/A = 5.60 MeV) $^{204}$Hg with $^{208}$Pb. We find that the yields of multi-nucleon transfer products are similar in these two reactions and are substantially lower than those observed in the reaction of 1257 MeV (E/A = 6.16 MeV) $^{204}$Hg + $^{198}$Pt. We compare our measurements with the predictions of the GRAZING-F, di-nuclear systems (DNS) and improved quantum molecular dynamics (ImQMD) models. For the observed isotopes of the elements Au, Hg, Tl, Pb and Bi, the measured values of the MNT cross sections are orders of magnitude larger than the predicted values. Furthermore, the various models predict the formation of nuclides near the N=126 shell, which are not observed.
Mobile phone data are an interesting new data source for official statistics. However, multiple problems and uncertainties need to be solved before these data can inform, support or even become an integral part of statistical production processes. In this paper, we focus on arguably the most important problem hindering the application of mobile phone data in official statistics: detecting home locations. We argue that current efforts to detect home locations suffer from a blind deployment of criteria to define a place of residence and from limited validation possibilities. We support our argument by analysing the performance of five home detection algorithms (HDAs) that have been applied to a large, French, Call Detailed Record (CDR) dataset (~18 million users, 5 months). Our results show that criteria choice in HDAs influences the detection of home locations for up to about 40% of users, that HDAs perform poorly when compared with a validation dataset (the 35°-gap), and that their performance is sensitive to the time period and the duration of observation. Based on our findings and experiences, we offer several recommendations for official statistics. If adopted, our recommendatio
Phosphorus (P) is considered to be one of the key elements for life, making it an important element to look for in the abundance analysis of spectra of stellar systems. Yet, there exist only a handful of spectroscopic studies that estimate P abundances and investigate their trend across a range of metallicities. We have observed full HK band spectra at a spectral resolving power of R=45,000 with the IGRINS instrument. Abundances are determined using SME in combination with 1D MARCS stellar atmosphere models. The investigated sample of stars has reliable stellar parameters estimated using optical FIES spectra (GILD; Jönsson et al. in prep.). In order to determine the P abundances from the 16482.92 Angstrom P line, we take special care of the CO($ν=7-4$) blend. We determine the C, N, O abundances from atomic carbon and a range of non-blended molecular lines (CO, CN, OH), which are plentiful in the H band region of K giant stars, ensuring an appropriate modelling of the blending CO($ν=7-4$) line. We present the [P/Fe] vs [Fe/H] trend for 38 K giant stars in the metallicity range of -1.2 dex $<$ [Fe/H] $<$ 0.4 dex. We find that our trend matches well with the compiled literature sample of
National statistical institutes currently investigate how to improve the output quality of official statistics based on machine learning algorithms. A key obstacle is concept drift, i.e., when the joint distribution of independent variables and a dependent (categorical) variable changes over time. Under concept drift, a statistical model requires regular updating to prevent it from becoming biased. However, updating a model asks for additional data, which are not always available. In the literature, we find a variety of bias correction methods as a promising solution. In the paper, we will compare two popular correction methods: the misclassification estimator and the calibration estimator. For prior probability shift (a specific type of concept drift), we investigate the two correction methods theoretically as well as experimentally. Our theoretical results are expressions for the bias and variance of both methods. As experimental result, we present a decision boundary (as a function of (a) model accuracy, (b) class distribution and (c) test set size) for the relative performance of the two methods. Close inspection of the results will provide a deep insight into the effect of pri
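The two correction methods named in the abstract above can be illustrated with a minimal sketch for the binary case. This is not the paper's derivation: it works with population-level (expected) quantities rather than sampled data, it assumes the classifier's per-class accuracies `p00` and `p11` stay fixed while the base rate shifts (prior probability shift), and all function and variable names are hypothetical.

```python
def predicted_positive_rate(alpha, p00, p11):
    """Expected fraction predicted positive when the true base rate is alpha.

    p00 = P(pred 0 | true 0), p11 = P(pred 1 | true 1) (assumed known here).
    """
    return alpha * p11 + (1.0 - alpha) * (1.0 - p00)


def misclassification_estimate(alpha_star, p00, p11):
    """Invert the misclassification relation to recover the base rate."""
    return (alpha_star - (1.0 - p00)) / (p00 + p11 - 1.0)


def calibration_probabilities(alpha_test, p00, p11):
    """P(true 1 | pred 0) and P(true 1 | pred 1) on a test set with base rate alpha_test."""
    pred_pos = predicted_positive_rate(alpha_test, p00, p11)
    c11 = alpha_test * p11 / pred_pos
    c01 = alpha_test * (1.0 - p11) / (1.0 - pred_pos)
    return c01, c11


def calibration_estimate(alpha_star, c01, c11):
    """Reweight the predicted class fractions with test-set calibration probabilities."""
    return alpha_star * c11 + (1.0 - alpha_star) * c01


# Hypothetical numbers: test set is balanced, target population has base rate 0.2.
p00, p11 = 0.8, 0.8
alpha_star = predicted_positive_rate(0.2, p00, p11)
c01, c11 = calibration_probabilities(0.5, p00, p11)
mis = misclassification_estimate(alpha_star, p00, p11)
cal = calibration_estimate(alpha_star, c01, c11)
```

With these illustrative numbers the misclassification estimate recovers the shifted base rate of 0.2, while the calibration estimate, which reuses calibration probabilities learned at the old base rate, lands near 0.39. The sketch ignores sampling variance, which is where the trade-off between the two estimators arises.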
We report on the gamma-ray activity of the blazar Mrk 501 during the first 480 days of Fermi operation. We find that the average LAT gamma-ray spectrum of Mrk 501 can be well described by a single power-law function with a photon index of 1.78 +/- 0.03. While we observe relatively mild flux variations with the Fermi-LAT (within less than a factor of 2), we detect remarkable spectral variability where the hardest observed spectral index within the LAT energy range is 1.52 +/- 0.14, and the softest one is 2.51 +/- 0.20. These unexpected spectral changes do not correlate with the measured flux variations above 0.3GeV. In this paper, we also present the first results from the 4.5-month-long multifrequency campaign (2009 March 15 - August 1) on Mrk 501, which included the VLBA, Swift, RXTE, MAGIC and VERITAS, the F-GAMMA, GASP-WEBT, and other collaborations and instruments which provided excellent temporal and energy coverage of the source throughout the entire campaign. The average spectral energy distribution of Mrk 501 is well described by the standard one-zone synchrotron self-Compton model. In the framework of this model, we find that the dominant emission region is characterized b
Binomial tree methods (BTM) and explicit difference schemes (EDS) for the variational inequality model of American options with time dependent coefficients are studied. When volatility is time dependent, it is not reasonable to assume that the dynamics of the underlying asset's price forms a binomial tree if a partition of time interval with equal parts is used. A time interval partition method that allows binomial tree dynamics of the underlying asset's price is provided. Conditions under which the prices of American option by BTM and EDS have the monotonic property on time variable are found. Using convergence of EDS for variational inequality model of American options to viscosity solution the decreasing property of the price of American put options and increasing property of the optimal exercise boundary on time variable are proved. First, put options are considered. Then the linear homogeneity and call-put symmetry of the price functions in the BTM and the EDS for the variational inequality model of American options with time dependent coefficients are studied and using them call options are studied.
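As a point of reference for the binomial tree method mentioned above, here is a minimal sketch of American put pricing on a standard Cox-Ross-Rubinstein tree. It assumes constant interest rate and volatility and an equal-step time partition, so it does not implement the time-dependent-coefficient partition that the abstract describes; the function name and parameters are illustrative.

```python
import math


def american_put_crr(s0, strike, rate, sigma, maturity, steps):
    """Price an American put on a CRR binomial tree (constant coefficients)."""
    dt = maturity / steps
    u = math.exp(sigma * math.sqrt(dt))      # up factor
    d = 1.0 / u                              # down factor
    disc = math.exp(-rate * dt)              # one-step discount factor
    p = (math.exp(rate * dt) - d) / (u - d)  # risk-neutral up probability

    # Payoffs at maturity; node j has j up-moves and (steps - j) down-moves.
    values = [
        max(strike - s0 * u**j * d**(steps - j), 0.0)
        for j in range(steps + 1)
    ]

    # Backward induction: take the larger of immediate exercise and continuation.
    for n in range(steps - 1, -1, -1):
        values = [
            max(
                strike - s0 * u**j * d**(n - j),
                disc * (p * values[j + 1] + (1.0 - p) * values[j]),
            )
            for j in range(n + 1)
        ]
    return values[0]


price_1y = american_put_crr(100.0, 100.0, 0.05, 0.2, 1.0, 200)
price_6m = american_put_crr(100.0, 100.0, 0.05, 0.2, 0.5, 200)
```

In this constant-coefficient setting the put with the longer time to maturity is worth at least as much as the shorter one, which is the kind of monotonicity in the time variable that the abstract establishes for the time-dependent case.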
In this work, we expand the idea of Samuelson[3] and Shepp[2,5,6] for stock optimization using the Bachelier model [4] as our models for the stock price at the money (X[stock price]= K[strike price]) for the American call and put options [1]. At the money (X= K) for American options, the expected payoff of both the call and put options is zero. Shepp investigated several stochastic optimization problems using martingale and stopping time theories [2,5,6]. One of the problems he investigated was how to optimize the stock price using both the Black-Scholes (multiplicative) and the Bachelier (additive) models [7,6] for the American option above the strike price K (exercise price) to a stopping point. In order to explore the non-relativistic quantum effect on the expected payoff for both the call and put options at the money, we assumed the stock price to undergo a stochastic process governed by the Bachelier (additive) model [4]. Further, using Ito calculus and martingale theory, we obtained a differential equation for the expected payoff for both the call and put options in terms of delta and gamma. We also obtained the solution to the non-relativistic Schroedinger equation as the ex
We have investigated the toroidal analog of ellipsoidal shells of matter, which are of great significance in Astrophysics. The exact formula for the gravitational potential $Ψ(R,Z)$ of a shell with a circular section at the pole of toroidal coordinates is first established. It depends on the mass of the shell, its main radius and axis-ratio $e$ (i.e. core-to-main radius ratio), and involves the product of the complete elliptic integrals of the first and second kinds. Next, we show that successive partial derivatives $\partial^{n +m} Ψ/\partial_{R^n} \partial_{Z^m}$ are also accessible by analytical means at that singular point, thereby enabling the expansion of the interior potential as a bivariate series. Then, we have generated approximations at orders $0$, $1$, $2$ and $3$, corresponding to increasing accuracy. Numerical experiments confirm the great reliability of the approach, in particular for small-to-moderate axis ratios ($e^2 \lesssim 0.1$ typically). In contrast with the ellipsoidal case (Newton's theorem), the potential is not uniform inside the shell cavity as a consequence of the curvature. We explain how to construct the interior potential of toroidal shells with a th
Experimental data on the shape of hadronic momentum spectra are compared to theoretical predictions in the context of calculations in the Modified Leading Log Approximation (MLLA), under the assumption of Local Parton Hadron Duality (LPHD). Considered are experimental measurements at $e^+e^-$-colliders of $ξ_p^*$, the position of the maximum in the distribution of $ξ_p=\log(1/x_p)$, where $x_p=p/p_{beam}$. The parameter $ξ_p^*$ is determined for various hadrons at various centre of mass energies. The dependence on the hadron type poses some interesting questions about the process of hadron-formation. The dependence of $ξ^*_p$ on the centre of mass energy is seen to be described adequately by perturbation theory. A quantitative check of LPHD + MLLA is possible by extracting a value of $α_s$ from an overall fit to the scaling behaviour of $ξ^*_p$.
We analyze data from the Hydrogen Epoch of Reionization Array. This is the third in a series of papers on the closure phase delay-spectrum technique designed to detect the HI 21cm emission from cosmic reionization. We present the details of the data and models employed in the power spectral analysis, and discuss limitations to the process. We compare images and visibility spectra made with HERA data, to parallel quantities generated from sky models based on the GLEAM survey, incorporating the HERA telescope model. We find reasonable agreement between images made from HERA data, with those generated from the models, down to the confusion level. For the visibility spectra, there is broad agreement between model and data across the full band of $\sim 80$MHz. However, models with only GLEAM sources do not reproduce a roughly sinusoidal spectral structure at the tens of percent level seen in the observed visibility spectra on scales $\sim 10$ MHz on 29 m baselines. We find that this structure is likely due to diffuse Galactic emission, predominantly the Galactic plane, filling the far sidelobes of the antenna primary beam. We show that our current knowledge of the frequency dependence o
Eccentric planets may spend a significant portion of their orbits at large distances from their host stars, where low temperatures can cause atmospheric CO2 to condense out onto the surface, similar to the polar ice caps on Mars. The radiative effects on the climates of these planets throughout their orbits would depend on the wavelength-dependent albedo of surface CO2 ice that may accumulate at or near apoastron and vary according to the spectral energy distribution of the host star. To explore these possible effects, we incorporated a CO2 ice-albedo parameterization into a one-dimensional energy balance climate model. With the inclusion of this parameterization, our simulations demonstrated that F-dwarf planets require 29% more orbit-averaged flux to thaw out of global water ice cover compared with simulations that solely use a traditional pure water ice-albedo parameterization. When no eccentricity is assumed, and host stars are varied, F-dwarf planets with higher bond albedos relative to their M-dwarf planet counterparts require 30% more orbit-averaged flux to exit a water snowball state. Additionally, the intense heat experienced at periastron aids eccentric planets in exiting
The 24 September 2001 College Park, Maryland, tornado was remarkable because of its long track, which passed within close range of two Doppler radars. This tornado featured many similarities to previous significant tornado events that resulted in widespread damage in urban areas, such as the Oklahoma City tornado of 3 May 1999. The College Park tornado was the third in a series of three tornadoes associated with a supercell storm that developed over central Virginia. This paper presents a synoptic and mesoscale overview of favorable conditions and forcing mechanisms that resulted in the severe convective outbreak associated with the College Park tornado. Convective morphology is examined in terms of Doppler radar and satellite imagery. This study concludes with a discussion of the effectiveness of using MM5 guidance in conjunction with satellite and radar imagery in the operational environment of forecasting severe convection.
Given the importance of accurate team rankings in American college football (CFB) -- due to heavy title and playoff implications -- strides have been made to improve evaluation metrics across statistical categories, going from basic averages (e.g. points scored per game) to metrics that adjust for a team's strength of schedule, but one aspect that hasn't been emphasized is the complementary nature of American football. Despite the same team's offensive and defensive units typically consisting of separate player sets, some aspects of your team's defensive (offensive) performance may affect the complementary side: turnovers forced by your defense could lead to easier scoring chances for your offense, while your offense's ability to control the clock may help your defense. For 2009-2019 CFB seasons, we incorporate natural splines with group penalty approaches to identify the most consistently influential features of complementary football in a data-driven way, conducting partially constrained optimization in order to additionally guarantee the full adjustment for strength of schedule and homefield factor. We touch on the issues arising due to reverse-causal nature of certain within-ga
We present the results of processing the effects of the powerful Gamma Ray Burst GRB221009A captured by the charged particle detectors (electrostatic analyzers and solid-state detectors) onboard spacecraft at different points in the heliosphere on October 9, 2022. To follow the GRB221009A propagation through the heliosphere we used the electron and proton flux measurements from solar missions Solar Orbiter and STEREO-A; Earth magnetosphere and the solar wind missions THEMIS and Wind; meteorological satellites POES15, POES19, MetOp3; and MAVEN - a NASA mission orbiting Mars. GRB221009A had a structure of four bursts: the less intense Pulse 1 - the triggering impulse - was detected by gamma-ray observatories at 13:16:59 UT (near the Earth); the most intense Pulses 2 and 3 were detected on board all the spacecraft from the list; and Pulse 4 was detected more than 500 s after Pulse 1. Due to their different scientific objectives, the spacecraft whose data were used in this study were separated by more than 1 AU (Solar Orbiter and MAVEN). This enabled tracking GRB221009A as it was propagating across the heliosphere. STEREO-A was the first to register Pulses 2 and 3 of the GRB, almost 100 secon
Recently, the pandemic of the novel Coronavirus Disease-2019 (COVID-19) has presented governments with ultimate challenges. In the United States, the country with the highest number of confirmed COVID-19 infection cases, a nationwide social distancing protocol has been implemented by the President. For the first time in a hundred years since the 1918 flu pandemic, the US population is mandated to stay in their households and avoid public contact. As a result, the majority of public venues and services have ceased their operations. Following the closure of the University of Washington on March 7th, more than a thousand colleges and universities in the United States have cancelled in-person classes and campus activities, impacting millions of students. This paper aims to discover the social implications of this unprecedented disruption in our interactive society regarding both the general public and higher education populations by mining people's opinions on social media. We discover several topics embedded in a large number of COVID-19 tweets that represent the most central issues related to the pandemic, which are of great concern for both college students and the general public. Moreover,
While an integration by parts formula for the bilinear form of the hypersingular boundary integral operator for the transient heat equation in three spatial dimensions is available in the literature, a proof of this formula seems to be missing. Moreover, the available formula contains an integral term including the time derivative of the fundamental solution of the heat equation, whose interpretation is difficult at second glance. To fill these gaps we provide a rigorous proof of a general version of the integration by parts formula and an alternative representation of the mentioned integral term, which is valid for a certain class of functions including the typical tensor-product discretization spaces.
We present coordinated multiwavelength observations of the bright, nearby BL Lac object Mrk 421 taken in 2013 January-March, involving GASP-WEBT, Swift, NuSTAR, Fermi-LAT, MAGIC, VERITAS, and other collaborations and instruments, providing data from radio to very-high-energy (VHE) gamma-ray bands. NuSTAR yielded previously unattainable sensitivity in the 3-79 keV range, revealing that the spectrum softens when the source is dimmer until the X-ray spectral shape saturates into a steep power law with a photon index of approximately 3, with no evidence for an exponential cutoff or additional hard components up to about 80 keV. For the first time, we observed both the synchrotron and the inverse-Compton peaks of the spectral energy distribution (SED) simultaneously shifted to frequencies below the typical quiescent state by an order of magnitude. The fractional variability as a function of photon energy shows a double-bump structure which relates to the two bumps of the broadband SED. In each bump, the variability increases with energy which, in the framework of the synchrotron self-Compton model, implies that the electrons with higher energies are more variable. The measured multi-ban