Data preprocessing is a significant step in machine learning: it improves model performance and decreases running time. It may include handling missing values, detecting and removing outliers, data augmentation, dimensionality reduction, data normalization, and controlling for the impact of confounding variables. Although these steps are found to improve model accuracy, they might hinder the explainability of the model if they are not carefully considered, especially in medicine. They might block new findings when missing-value imputation and outlier removal are implemented inappropriately. In addition, they might make the model unfair to some of the groups affected by its decisions. Moreover, normalization turns features into unitless and clinically meaningless quantities, and consequently makes them less explainable. This paper discusses the common data preprocessing steps in machine learning and their impacts on the explainability and interpretability of the model. Finally, the paper discusses possible solutions that improve the performance of the model without decreasing its explainability.
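To make the trade-off concrete, here is a minimal sketch of three of the preprocessing steps the abstract mentions, with comments flagging where each can hurt explainability. All function names, thresholds, and the toy data are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch: common preprocessing steps and their explainability cost.
# Thresholds, names, and data below are hypothetical.

def impute_missing(values, sentinel=None):
    """Replace missing entries with the mean of the observed ones.
    Naive imputation can mask clinically meaningful missingness."""
    observed = [v for v in values if v is not sentinel]
    mean = sum(observed) / len(observed)
    return [mean if v is sentinel else v for v in values]

def remove_outliers(values, k=2.0):
    """Drop points more than k standard deviations from the mean.
    Aggressive removal may discard rare but real clinical cases."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [v for v in values if abs(v - mean) <= k * std]

def zscore(values):
    """Standardize to zero mean and unit variance.
    The result is unitless, which can hurt clinical interpretability."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]

ages = [54, 61, None, 47, 250, 58]  # 250 is a data-entry outlier, None is missing
cleaned = remove_outliers(impute_missing(ages))
normalized = zscore(cleaned)        # unitless: "age" is no longer in years
```

The comments mark exactly the failure modes the paper discusses: imputation hiding informative missingness, outlier removal deleting rare cases, and normalization stripping clinical units.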
We consider small-scale jet-like events that might make the solar wind, as has been suggested in recent studies. We show that the events referred to as "coronal jets" and as "jetlets" both fall on a power-law distribution that also includes large-scale eruptions and spicule-sized features; all of the jet-like events could contribute to the solar wind. Based on imaging and magnetic field data, it is plausible that many or most of these events might form by the same mechanism: Magnetic flux cancelation produces small-scale flux ropes, often containing a cool-material minifilament. This minifilament/flux rope erupts and reconnects with adjacent open coronal field, along which "plasma jets" flow and contribute to the solar wind. The erupting flux ropes can contain twist that is transferred to the open field, and these become Alfvénic pulses that form magnetic switchbacks, providing an intrinsic connection between switchbacks and the production of the solar wind.
This position paper argues that effectively "democratizing AI" requires democratic governance and alignment of AI, and that this is particularly valuable for decisions with systemic societal impacts. Initial steps -- such as Meta's Community Forums and Anthropic's Collective Constitutional AI -- have illustrated a promising direction, where democratic processes could be used to meaningfully improve public involvement and trust in critical decisions. To more concretely explore what increasingly democratic AI might look like, we provide a "Democracy Levels" framework and associated tools that: (i) define milestones toward meaningfully democratic AI, which is also crucial for substantively pluralistic, human-centered, participatory, and public-interest AI, (ii) can help guide organizations seeking to increase the legitimacy of their decisions on difficult AI governance and alignment questions, and (iii) support the evaluation of such efforts.
The possibility that nuclear matter might be Quarkyonic is considered. Quarkyonic matter is high baryon density matter that is confined but can be approximately thought of as a filled Fermi sea of quarks surrounded by a shell of nucleons. Here, nuclear matter is described by the IdylliQ sigma model for Quarkyonic matter, generalizing the non-interacting IdylliQ model [Y. Fujimoto et al., Phys. Rev. Lett. 132, 112701 (2024) [arXiv:2306.04304]] to include interactions with a sigma meson and a pion. When such interactions are included, we find that isospin-symmetric nuclear matter binds, with acceptable values of the compressibility and other parameters for nuclear matter at saturation. The energy per nucleon and sound velocity of such matter is computed, and the isospin dependence is determined. Nuclear matter is formed at a density close to but slightly above the density at which Quarkyonic matter forms. Quarkyonic matter predicts a strong depletion of nucleons in normal nuclear matter at low momentum. Such a depletion for nucleon momenta $k \lesssim 120$ MeV is shown to be consistent with electron scattering data.
Private closeness testing asks to decide whether the underlying probability distributions of two sensitive datasets are identical or differ significantly in statistical distance, while guaranteeing (differential) privacy of the data. As in most (if not all) distribution testing questions studied under privacy constraints, however, previous work assumes that the two datasets are equally sensitive, i.e., must be provided the same privacy guarantees. This is often an unrealistic assumption, as different sources of data come with different privacy requirements; as a result, known closeness testing algorithms might be unnecessarily conservative, "paying" too high a privacy budget for half of the data. In this work, we initiate the study of the closeness testing problem under heterogeneous privacy constraints, where the two datasets come with distinct privacy requirements. We formalize the question and provide algorithms under the three most widely used differential privacy settings, with a particular focus on the local and shuffle models of privacy; and show that one can indeed achieve better sample efficiency when taking into account the two different "epsilon" requirements.
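As a toy illustration of heterogeneous privacy budgets, the following sketch applies the standard Laplace mechanism to a counting query with two different epsilon values. This is a generic centralized-DP example, not the paper's local or shuffle-model algorithms; all names and parameters are illustrative.

```python
import math
import random

def laplace_sample(scale, rng):
    """Draw from Laplace(0, scale) by inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(data, predicate, epsilon, rng):
    """epsilon-DP count of items satisfying `predicate` (sensitivity 1),
    via the Laplace mechanism with noise scale 1/epsilon."""
    true_count = sum(1 for x in data if predicate(x))
    return true_count + laplace_sample(1.0 / epsilon, rng)

rng = random.Random(0)
# Dataset A is more sensitive (smaller epsilon => more noise) than dataset B;
# a closeness tester that ignores this would "pay" epsilon=0.5 for both halves.
noisy_a = private_count(range(100), lambda x: x % 2 == 0, epsilon=0.5, rng=rng)
noisy_b = private_count(range(100), lambda x: x % 2 == 0, epsilon=2.0, rng=rng)
```

The point of the paper's setting is visible even in this sketch: treating both datasets at the stricter epsilon adds unnecessary noise to the less sensitive half, which is exactly the sample-efficiency slack the heterogeneous formulation recovers.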
Our previous research has confirmed that the USD/JPY rate tends to rise toward 9:55 every morning on Gotobi days, i.e., days of the month divisible by five. This is called the Gotobi anomaly. In the present study, we examine a possible trading strategy and verify its validity under the condition that investors recognize the existence of the anomaly. Moreover, we illustrate the possibility that the wealth of Japanese companies might leak to FX traders through this arbitrage opportunity if Japanese companies blindly keep making payments on Gotobi days as a business custom.
Radial velocity surveys suggest that the Solar System may be unusual and that Jupiter-like planets have a frequency <20% around solar-type stars. However, they may be much more common in one of the closest associations in the solar neighbourhood. Young moving stellar groups are the best targets for direct imaging of exoplanets: four massive Jupiter-like planets have already been discovered in the nearby young beta Pic Moving Group (BPMG) via high-contrast imaging, and four others were suggested via high-precision astrometry by the European Space Agency's Gaia satellite. Here we analyze 30 stars in BPMG and show that 20 of them might potentially host a Jupiter-like planet, as such orbits would be stable. Considering incompleteness in observations, our results suggest that Jupiter-like planets may be more common than previously found. The next Gaia data release will likely confirm our prediction.
The susceptibility of modern machine learning classifiers to adversarial examples has motivated theoretical results suggesting that these might be unavoidable. However, these results can be too general to be applicable to natural data distributions. Indeed, humans are quite robust for tasks involving vision. This apparent conflict motivates a deeper dive into the question: Are adversarial examples truly unavoidable? In this work, we theoretically demonstrate that a key property of the data distribution -- concentration on small-volume subsets of the input space -- determines whether a robust classifier exists. We further demonstrate that, for a data distribution concentrated on a union of low-dimensional linear subspaces, utilizing structure in data naturally leads to classifiers that enjoy data-dependent polyhedral robustness guarantees, improving upon methods for provable certification in certain regimes.
The hypothesis that the human brain operates in the vicinity of a critical point has been a matter of hot debate in recent years. Evidence for a naturally occurring phase transition across this critical point has so far been missing. Here we show that love might be an example of such a second-order phase transition. This hypothesis allows one to describe both love at first sight and love that grows from liking or friendship.
As a cheap and safe antimalarial agent, chloroquine (CQ) has been used in the battle against malaria for more than half a century. However, the mechanism of CQ action and resistance in Plasmodium falciparum remains elusive. Based on further analysis of our published experimental results, we propose that the mechanism of CQ action and resistance might be closely linked with cell-cycle-associated amplified genomic-DNA fragments (CAGFs, singular form = CAGF): CQ induces CAGF production in P. falciparum, which could affect multiple biological processes of the parasite and thus might contribute to parasite death and CQ resistance. Recently, we found that one CQ-induced CAGF, UB1-CAGF, might downregulate the expression of a probable P. falciparum cystine transporter (Pfct) gene, which could help in understanding the mechanism of CQ action and resistance in P. falciparum.
Standard losses for training deep segmentation networks could be seen as individual classifications of pixels rather than supervision of the global shape of the predicted segmentations. While effective, they require exact knowledge of the label of each pixel in an image. This study investigates how effective global geometric shape descriptors could be when used on their own as segmentation losses for training deep networks. Beyond their theoretical interest, there are deeper motivations for posing segmentation problems as a reconstruction of shape descriptors: annotations to obtain approximations of low-order shape moments could be much less cumbersome than their full-mask counterparts, and anatomical priors could be readily encoded into invariant shape descriptions, which might alleviate the annotation burden. Also, and most importantly, we hypothesize that, for a given task, certain shape descriptions might be invariant across image acquisition protocols/modalities and subject populations, which might open interesting research avenues for generalization in medical image segmentation. We introduce and formulate a few shape descriptors in the context of deep segmentation, and evaluate
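The idea of supervising only low-order shape moments, rather than per-pixel labels, can be sketched as follows. The choice of descriptors (size and centroid) and the squared-error weighting are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: a loss that supervises only low-order shape moments of a predicted
# soft segmentation mask, instead of every pixel label.

def shape_moments(mask):
    """Zeroth moment (size) and first moments (centroid) of a 2-D soft mask."""
    size = sum(sum(row) for row in mask)
    cy = sum(i * v for i, row in enumerate(mask) for v in row) / size
    cx = sum(j * v for row in mask for j, v in enumerate(row)) / size
    return size, cy, cx

def moment_loss(pred, target_moments):
    """Squared error between predicted and target shape descriptors.
    The target needs only approximate moments, not a full ground-truth mask."""
    s, cy, cx = shape_moments(pred)
    ts, tcy, tcx = target_moments
    return (s - ts) ** 2 + (cy - tcy) ** 2 + (cx - tcx) ** 2
```

Note how the supervision signal here is three scalars per image rather than a dense label map, which is the annotation saving the abstract describes; the moments are also differentiable in the mask values, so such a loss could drive gradient-based training.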
The discussion among the automotive industry, governments, ethicists, policy makers, and the general public about autonomous cars' moral agency is widening, and therefore we see the need to bring more insight into what meta-factors might actually influence the outcomes of such discussions, surveys, and plebiscites. In our study, we focus on the psychological (personality traits), practical (active driving experience), gender, and rhetoric/framing factors that might impact and even determine respondents' a priori preferences regarding autonomous cars' operation. We conducted an online survey (N=430) whose data show that the third-person scenario is less biased than the first-person scenario when presenting an ethical dilemma related to autonomous cars. According to our analysis, gender bias should be explored in more extensive future studies as well. We recommend that any participatory technology assessment discourse use the third-person scenario and direct attention to the way any autonomous-car-related debate is introduced, especially in terms of linguistic and communication aspects and gender.
The Hermean average perihelion rate $\dot\omega^\mathrm{2PN}$, calculated to the second post-Newtonian (2PN) order with the Gauss perturbing equations and the osculating Keplerian orbital elements, ranges from $-18$ to $-4$ microarcseconds per century $\left(\mu\mathrm{as\,cty}^{-1}\right)$, depending on the true anomaly at epoch $f_0$. It is the sum of four contributions: one of them is the direct consequence of the 2PN acceleration entering the equations of motion, while the other three are indirect effects of the 1PN component of the Sun's gravitational field. An evaluation of the merely formal uncertainty of the experimental Mercury's perihelion rate $\dot\omega_\mathrm{exp}$ recently published by the present author, based on 51 years of radiotechnical data processed with the EPM2017 planetary ephemerides by the astronomers E.V. Pitjeva and N.P. Pitjev, is $\sigma_{\dot\omega_\mathrm{exp}}\simeq 8\,\mu\mathrm{as\,cty}^{-1}$, corresponding to a relative accuracy of $2\times 10^{-7}$ for the combination $\left(2 + 2\gamma - \beta\right)/3$ of the PPN parameters $\beta$ and $\gamma$ scaling the well known 1PN perihelion precession. In fact, the realistic uncertainty may be up to $\simeq 10$-$50$ times larger, despite reproce
Low-mass helium-core white dwarfs (WDs) with masses below 0.5 Msun are known to be formed in binary star systems, but unexpectedly a significant fraction of them seem to be single. On the other hand, in Cataclysmic Variables (CVs) a large number of low-mass WD primary stars are predicted but not observed. We recently showed that the latter problem can be solved if consequential angular momentum loss preferentially causes CVs with low-mass WDs to merge and form single stars. Here we simulate the population of single WDs resulting from single star evolution and from binary star mergers, taking into account these new merging CVs. We show that according to the revised model of CV evolution, merging CVs might be the dominant channel leading to the formation of low-mass single WDs, and that the predicted relative numbers are consistent with observations. This can be interpreted as further evidence for the revised model of CV evolution we recently suggested. This model includes consequential angular momentum loss that increases with decreasing WD mass and might explain not only the absence of low-mass WD primaries in CVs but also the existence of single low-mass WDs.
It is proposed that `bare' strange matter stars might not be bare, and that radio pulsars might in fact be `bare' strange stars. As intensely magnetized strange matter stars rotate, the induced unipolar electric fields would be large enough to construct magnetospheres. This situation is very similar to that discussed by many authors for rotating neutron stars. Also, strange stars with accretion crusts in binaries could act as X-ray pulsars or X-ray bursters. There are some advantages if radio pulsars are `bare' strange stars.
Learning physics is a context-dependent process. I consider a broader interdisciplinary problem of where differences in understanding and reasoning arise. I suggest the long-run effects that a multiple-choice-based learning system, as well as societal cultural habits and rules, might have on students' reasoning structure.
Site-directed mutagenesis refers to a man-made molecular biology method that is used to make genetic alterations in the DNA sequence of a gene of interest. Based on our recently published experimental findings, however, we propose that natural site-directed mutagenesis might exist in eukaryotic cells, triggered by harmful agents and co-directed by special transcription hotspots and mutation-containing intranuclear primers.
We perform a theoretical analysis of the $a_1$ resonance mass spectrum in ultra-relativistic heavy ion collisions within a hadron/string transport approach. Predictions for the $a_1$ yield and its mass distribution are given for the GSI-FAIR and the critRHIC energy regime. The potential of the $a_1$ meson as a signal for chiral symmetry restoration is explored. In view of the latest discussion, we investigate the decay channel $a_1 \to \gamma\pi$ in detail and find a strong bias towards low $a_1$ masses. This apparent mass shift of the $a_1$, if observed in the $\gamma\pi$ channel, might render a possible mass shift due to chiral symmetry restoration very difficult to disentangle from the decay kinematics.
We claim that there might exist a new interaction leading to very fast baryon-number-violating processes quite observable under laboratory conditions, provided all three generations are simultaneously involved.
Intermediate-mass black holes (BHs) in local dwarf galaxies are considered the relics of the early seed BHs. However, their growth might have been impacted by galaxy mergers and BH feedback so that they cannot be treated as tracers of the early seed BH population.