Large language models (LLMs) have shown high agreement with human raters across a variety of tasks, demonstrating potential to ease the challenges of human data collection. In computational social science (CSS), researchers are increasingly leveraging LLM annotations to complement slow and expensive human annotations. Still, guidelines for collecting and using LLM annotations, without compromising the validity of downstream conclusions, remain limited. We introduce Confidence-Driven Inference: a method that combines LLM annotations and LLM confidence indicators to strategically select which human annotations should be collected, with the goal of producing accurate statistical estimates and provably valid confidence intervals while reducing the number of human annotations needed. Our approach comes with safeguards against LLM annotations of poor quality, guaranteeing that the conclusions will be both valid and no less accurate than if we only relied on human annotations. We demonstrate the effectiveness of Confidence-Driven Inference over baselines in statistical estimation tasks across three CSS settings--text politeness, stance, and bias--reducing the needed number of human annotations.
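The estimator family the abstract describes can be illustrated with a simplified sketch (not the paper's exact procedure; all names here are illustrative): spend the human-annotation budget preferentially where the LLM reports low confidence, then debias the cheap LLM mean with an inverse-probability-weighted correction so the result is unbiased for the human-label mean regardless of LLM quality.

```python
import numpy as np

def confidence_driven_mean(llm_labels, human_labels, confidences, budget, seed=0):
    """Illustrative confidence-driven estimator of a population mean.

    The human budget is spent preferentially on items where the LLM
    reports low confidence; the LLM mean is then debiased with an
    inverse-probability-weighted (Horvitz-Thompson style) correction
    on the sampled items. Simplified sketch, not the paper's method.
    """
    llm = np.asarray(llm_labels, float)
    human = np.asarray(human_labels, float)
    conf = np.asarray(confidences, float)
    rng = np.random.default_rng(seed)
    # sampling probabilities: larger where the LLM is less confident
    p = (1.0 - conf) + 1e-6
    p = np.minimum(1.0, budget * p / p.sum())
    sampled = rng.random(len(llm)) < p
    # E[corr_i] = human_i - llm_i, so the estimator is unbiased
    corr = np.zeros(len(llm))
    corr[sampled] = (human[sampled] - llm[sampled]) / p[sampled]
    return llm.mean() + corr.mean()
```

In a real deployment `human_labels` would only be queried for the sampled items; here the full array is passed purely to keep the simulation self-contained.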
Quantum process tomography of each directly implementable quantum gate used in the IBM quantum processors is performed to compute gate error, in order to check the viability of complex quantum operations in the superconductivity-based quantum computers introduced by IBM and to compare the quality of these gates with the corresponding gates implemented using other technologies. Quantum process tomography (QPT) of C-NOT gates has been performed for three configurations available in the IBM QX4 processor. For all the other allowed gates, QPT has been performed for every allowed position (i.e., by placing the gates in different qubit lines) for the IBM QX4 architecture, and thus, gate fidelities are obtained for both single-qubit and 2-qubit gates. Gate fidelities are observed to be lower than the corresponding values obtained in other technologies, like NMR. Further, gate fidelities for all the single-qubit gates are obtained for the IBM QX2 architecture by placing the gates in the third qubit line ($q[2]$). It is observed that the IBM QX4 architecture yields better gate fidelity compared to IBM QX2 in all cases except the case of the $\operatorname{Y}$ gate, as far as the gate fidelity corresponding to this qubit line is concerned.
The presented algorithms for segmentation and tracking follow a 3-step approach where we detect, track and finally segment nuclei. In the preprocessing phase, we detect centroids of the cell nuclei using a convolutional neural network (CNN) for the 2D images and a Laplacian-of-Gaussian Scale Space Maximum Projection approach for the 3D data sets. Tracking was performed in a backwards fashion on the predicted seed points, i.e., starting at the last frame and sequentially connecting corresponding objects until the first frame was reached. Correspondences were identified by propagating detections of a frame t to its preceding frame t-1 and by combining redundant detections using a hierarchical clustering approach. The tracked centroids were then used as input to variants of the seeded watershed algorithm to obtain the final segmentation.
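The backward-linking step described above can be sketched with a minimal nearest-neighbour version (the helper names and the single-linkage merge threshold are illustrative assumptions, not the authors' exact implementation):

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.cluster.hierarchy import linkage, fcluster

def merge_redundant(points, merge_dist):
    """Fuse redundant detections closer than merge_dist using
    single-linkage hierarchical clustering (cluster means are kept)."""
    points = np.asarray(points, float)
    if len(points) < 2:
        return points
    labels = fcluster(linkage(points, method="single"),
                      merge_dist, criterion="distance")
    return np.array([points[labels == l].mean(axis=0)
                     for l in np.unique(labels)])

def backward_track(frames, merge_dist=3.0):
    """Link nuclei centroids backwards in time: start at the last frame
    and propagate each detection to its nearest neighbour in the
    preceding frame. frames: list of (N_t, 2) centroid arrays.
    Returns tracks as lists of (frame_index, centroid) pairs."""
    frames = [merge_redundant(f, merge_dist) for f in frames]
    tracks = [[(len(frames) - 1, c)] for c in frames[-1]]
    for t in range(len(frames) - 1, 0, -1):
        prev = frames[t - 1]
        if len(prev) == 0:
            break
        for tr in tracks:
            f_idx, c = tr[-1]
            if f_idx != t:
                continue  # track not active at this frame
            j = int(np.argmin(cdist([c], prev)))
            tr.append((t - 1, prev[j]))
    return tracks
```

The tracked centroids returned here would then seed the watershed segmentation mentioned above.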
A tree view or tree navigator is used to display hierarchical data organized in the form of a tree. In a tree structure there are parent and child nodes. The child nodes may further have descendants to n levels. There are many methods to make navigation easy, such as expanding and collapsing branches, splitting the tree, displaying a parent node in a separate tree, zooming branches, and scrolling in various directions. It is still a difficult exercise to handle large trees efficiently, and efforts continue to manage large numbers of nodes with faster speed, greater control, user friendliness and aesthetics. This article illustrates five inventions on tree navigators selected from the US patent database. Each of them tries to solve various problems relating to the tree navigator in different ways. Each invention is also analyzed from a TRIZ perspective.
Software engineering is knowledge-intensive work, and how to manage software engineering knowledge has received much attention. This systematic review identifies empirical studies of knowledge management initiatives in software engineering, and discusses the concepts studied, the major findings, and the research methods used. Seven hundred and sixty-two articles were identified, of which 68 were studies in an industry context. Of these, 29 were empirical studies and 39 reports of lessons learned. More than half of the empirical studies were case studies. The majority of empirical studies relate to technocratic and behavioural aspects of knowledge management, while there are few studies relating to economic, spatial and cartographic approaches. A finding reported across multiple papers was the need to not focus exclusively on explicit knowledge, but also consider tacit knowledge. We also describe implications for research and for practice.
Dialog boxes are useful for displaying warnings, errors, confirmations, etc. in special situations. A typical dialog box is displayed in a small window with a text message along with a few options for the user to select. However, there are certain difficulties associated with programming and implementing a conventional dialog box, such as severe programming effort, the rigidity of hard-coded messages, obscuring of screen space, and so on. There is a need to overcome these difficulties to make dialog boxes more efficient and useful. The modality of dialog boxes also creates some limitations. While modal dialog boxes need to be closed explicitly by the user, modeless dialog boxes can grow in number and become difficult to control. Thus, an ideal dialog box should be free of all the above-mentioned drawbacks. The dialog box should not obscure the screen, and the user should be able to open multiple dialog boxes without obscuring the screen. This article analyses 5 interesting inventions on dialog boxes selected from the US Patent database. Each invention tries to overcome some limitations of a conventional dialog box and provides some innovative features. Each solution is also analyzed from a TRIZ perspective.
Searches for phrases and word sets in large text arrays by means of additional indexes are considered. Their use may reduce the query-processing time by an order of magnitude in comparison with standard inverted files.
We present a new technique to constrain the gravitational potential of a galaxy from the observed stellar mass surface density alone, under a number of assumptions. It uses the classical Eddington Inversion Method to compute the phase-space distribution function (DF) needed for the stars to reside in a given gravitational potential. In essence, each potential defines a set of density profiles, and it is the expansion of the observed profile in this database that provides the DF. If the required DF becomes negative, the potential is inconsistent with the observed stars and can be discarded. The technique is particularly well suited to analyzing low-mass, low surface brightness galaxies, where photometric but not spectroscopic data can be obtained. The recently discovered low surface brightness galaxy 'Nube' is used to showcase its application. For the observed stellar core of Nube to be reproduced with a non-negative DF, cuspy NFW (Navarro, Frenk, and White) potentials are highly disfavored compared with potentials having cores (Schuster-Plummer or rho-230). The method assumes the stellar system to have spherical symmetry and an isotropic velocity distribution; however, we discuss simple extensions to relax these assumptions.
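For reference, the classical Eddington inversion that the technique builds on gives the isotropic DF from the density $\rho$ expressed as a function of the relative potential $\Psi$, with $\mathcal{E}$ the relative energy (standard textbook form):

```latex
f(\mathcal{E}) \;=\; \frac{1}{\sqrt{8}\,\pi^{2}}
\left[
\int_{0}^{\mathcal{E}} \frac{d^{2}\rho}{d\Psi^{2}}\,
\frac{d\Psi}{\sqrt{\mathcal{E}-\Psi}}
\;+\;
\frac{1}{\sqrt{\mathcal{E}}}
\left(\frac{d\rho}{d\Psi}\right)_{\Psi=0}
\right]
```

The physical-consistency test is simply $f(\mathcal{E}) \ge 0$ for all $\mathcal{E}$; any trial potential whose implied DF goes negative is discarded, as described above.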
As NASA's New Horizons spacecraft exits the Solar System bound for interstellar space, it has traveled so far that the nearest stars have shifted markedly from their positions seen from Earth. We demonstrated this by imaging the Proxima Centauri and Wolf 359 fields from Earth and New Horizons on 2020 April 23, when the spacecraft was 47.1 au distant. The observed parallaxes for Proxima Centauri and Wolf 359 are $32.4''$ and $15.7'',$ respectively. These measurements are not of research grade, but directly seeing large stellar parallaxes between two widely separated simultaneous observers is vividly educational. Using the New Horizons positions of the two stars alone, referenced to the three-dimensional model of the solar neighborhood constructed from Gaia DR3 astrometry, further provides the spacecraft spatial position relative to nearby stars with 0.44 au accuracy. The range to New Horizons from the Solar System barycenter is recovered to 0.27 au accuracy, and its angular direction to $0.4^\circ$ accuracy, when compared to the precise values from NASA Deep Space Network tracking. This is the first time optical stellar astrometry has been used to determine the three-dimensional location of a spacecraft.
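As a rough consistency check, the size of the shift follows from the small-angle parallax relation (distances below are approximate published values, used only for illustration):

```latex
\theta \;\approx\; \frac{b_{\perp}/\mathrm{au}}{d/\mathrm{pc}}\;\text{arcsec}
```

For Proxima Centauri ($d \approx 1.30$ pc) a 47.1 au baseline gives at most $47.1/1.30 \approx 36''$, and for Wolf 359 ($d \approx 2.41$ pc) at most $47.1/2.41 \approx 20''$; the observed $32.4''$ and $15.7''$ are somewhat smaller because only the baseline component perpendicular to each line of sight ($b_{\perp}$) contributes.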
In this work, we demonstrate that the Hindmarsh-Rose model subjected to additive white noise exhibits birhythmicity. Specifically, the system fluctuates between two distinct bursting attractors characterized by different numbers of spikes. This behavior is observed not only within the bistable region bounded by two saddle-node bifurcations of limit cycles but also beyond these boundaries. This phenomenon is associated with the ghost effect, typically observed near deterministic saddle-node bifurcations. We map the region of stochastic birhythmicity in terms of the noise intensity and a key deterministic parameter that controls the dynamics of fast ion channels. To provide an analytical foundation, we introduce a simple stochastic model with a single saddle-node bifurcation. In this model, stochastic birhythmicity is similarly characterized as a function of noise intensity and the control parameter.
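The noisy dynamics can be reproduced with a straightforward Euler-Maruyama integration of the Hindmarsh-Rose equations; a minimal sketch, where the parameter values are the standard textbook set and are assumptions rather than the paper's exact choices:

```python
import numpy as np

def simulate_hr(T=1000.0, dt=0.005, sigma=0.05, I=3.25, b=3.0, seed=0):
    """Euler-Maruyama integration of the Hindmarsh-Rose neuron with
    additive white noise on the membrane variable x.
    a=1, c=1, d=5, s=4, x_R=-1.6, r=0.006 are standard values
    (assumed); b controls the fast-subsystem dynamics."""
    a, c, d, s, x_r, r = 1.0, 1.0, 5.0, 4.0, -1.6, 0.006
    rng = np.random.default_rng(seed)
    n = int(T / dt)
    x = np.empty(n); y = np.empty(n); z = np.empty(n)
    x[0], y[0], z[0] = -1.6, 0.0, 2.0
    sqdt = np.sqrt(dt)
    for k in range(n - 1):
        x[k + 1] = x[k] + dt * (y[k] - a * x[k]**3 + b * x[k]**2 - z[k] + I) \
                   + sigma * sqdt * rng.standard_normal()
        y[k + 1] = y[k] + dt * (c - d * x[k]**2 - y[k])
        z[k + 1] = z[k] + dt * r * (s * (x[k] - x_r) - z[k])
    return x, y, z

def count_spikes(x, thresh=1.0):
    """Number of upward threshold crossings of the membrane variable;
    counting spikes per burst is how the two attractors are told apart."""
    up = (x[:-1] < thresh) & (x[1:] >= thresh)
    return int(up.sum())
```

Sweeping `sigma` and `b` over a grid and tallying the per-burst spike counts is one way to map the birhythmic region described above.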
As video games continue to evolve, understanding what drives player enjoyment remains a key challenge. Player reviews provide valuable insights, but their unstructured nature makes large-scale analysis difficult. This study applies generative AI and machine learning, leveraging the Microsoft Phi-4 small language model (SLM) and Google Cloud, to quantify and analyze game reviews from Steam and Meta Quest stores. The approach converts qualitative feedback into structured data, enabling comprehensive evaluation of key game design elements, monetization models, and platform-specific trends. The findings reveal distinct patterns in player preferences across PC and VR games, highlighting factors that contribute to higher player enjoyment. By using Google Cloud for large-scale data storage and processing, this study establishes a scalable framework for game review analysis. The study's insights offer actionable guidance for game developers, helping optimize game mechanics, pricing strategies, and player engagement.
Manifold visualisation techniques are commonly used to visualise high-dimensional datasets in physical sciences. In this paper we apply a recently introduced manifold visualisation method, called Slisemap, on datasets from physics and chemistry. Slisemap combines manifold visualisation with explainable artificial intelligence. Explainable artificial intelligence is used to investigate the decision processes of black box machine learning models and complex simulators. With Slisemap we find an embedding such that data items with similar local explanations are grouped together. Hence, Slisemap gives us an overview of the different behaviours of a black box model. This makes Slisemap into a supervised manifold visualisation method, where the patterns in the embedding reflect a target property. In this paper we show how Slisemap can be used and evaluated on physical data and that Slisemap is helpful in finding meaningful information on classification and regression models trained on these datasets.
Quantum Candies, or Qandies, provide a lucid way of understanding the concepts of quantum information and quantum science in the language of candies. The key idea of qandies is to depict quantum science intuitively to the general public, which is fitting given that most research in this domain is funded by taxpayers. The qandies model has already been used to explain essential concepts of quantum science and quantum cryptography; however, teleportation and related concepts are yet to be explained. Motivated by this fact, we investigate and extend the ideas of Jacobs and Lin-Mor-Shapira to explain teleportation using qandies. Here, we explicitly design the teleportation protocol and construct a circuit model using qandy gates. The protocol is successful when the correlated qandies are appropriately pre-shared and local operations are applied at both ends. The model we develop can be a valuable tool for science and engineering educators who want to help the general public gain more insight into quantum science and technology.
Brain age is an estimate of biological age derived from neuroimaging datasets using machine learning algorithms. Increasing brain age with respect to chronological age can reflect increased vulnerability to neurodegeneration and cognitive decline. In this paper, we study NeuroVNN, based on coVariance neural networks, as a paradigm for a foundation model for the brain age prediction application. NeuroVNN is pre-trained as a regression model on a healthy population to predict chronological age using cortical thickness features, and fine-tuned to estimate brain age in different neurological contexts. Importantly, NeuroVNN adds anatomical interpretability to brain age and has a `scale-free' characteristic that allows its transference to datasets curated according to any arbitrary brain atlas. Our results demonstrate that NeuroVNN can extract biologically plausible brain age estimates in different populations, as well as transfer successfully to datasets of dimensionalities distinct from that of the dataset used to train NeuroVNN.
Popular frameworks for self-supervised learning of speech representations have largely focused on frame-level masked prediction of speech regions. While this has shown promising downstream task performance for speech recognition and related tasks, it has largely ignored factors of speech that are encoded at a coarser level, such as characteristics of the speaker or channel that remain consistent throughout a speech utterance. In this work, we propose a framework for Learning Disentangled Self-Supervised (termed Learn2Diss) representations of speech, which consists of a frame-level and an utterance-level encoder module. The two encoders are initially learned independently, where the frame-level model is largely inspired by existing self-supervision techniques, thereby learning pseudo-phonemic representations, while the utterance-level encoder is inspired by contrastive learning of pooled embeddings, thereby learning pseudo-speaker representations. The joint learning of these two modules consists of disentangling the two encoders using a mutual information based criterion. With several downstream evaluation experiments, we show that the proposed Learn2Diss achieves state-of-the-art performance.
Sample selection models are a widely used approach for correcting bias caused by data that are missing not at random. Their formulation requires specifying the variables that influence the outcome and those that drive the selection process. This specification is often based on expert knowledge, which can result in the inclusion of irrelevant variables or the omission of important ones. Moreover, to avoid inferential problems such as practical non-identifiability, practitioners frequently impose exclusion restrictions, that is, model specifications in which certain variables predict selection but have no effect on the outcome of interest. A recent proposal employs adaptive LASSO to select the variables that enter into the outcome and selection equations, but its performance depends on the so-called covariance assumption, which can be violated in small to moderate samples. To address these challenges, we propose two families of spike-and-slab priors to conduct Bayesian variable selection in sample selection models. These prior structures allow for constructing a Gibbs sampler with tractable conditionals, which is scalable to the dimensions of practical interest. We illustrate the performance of the proposed approach.
Synchronization dynamics is a phenomenon of great interest in many fields of science. One of the most important is neuron dynamics, as synchronization in certain regions of the brain is related to some of the most common mental illnesses. To study the impact of network heterogeneity on neuronal synchronization, we analyze a small-world network of non-identical Chialvo neurons that are electrically coupled. We introduce a mismatch in one of the model parameters to create the heterogeneity of the network. Our study examines the effects of this parameter mismatch, the noise intensity in the stochastic model, and the coupling strength between neurons on synchronization and firing frequency. We have identified critical values of noise intensity, parameter mismatch, and rewiring probability that facilitate effective synchronization within the network. Furthermore, we observe that the balance between excitatory and inhibitory connections plays a crucial role in achieving global synchronization. Our findings offer insights into the mechanisms driving synchronization dynamics in complex neuron networks.
We investigate the synchronization between two neurons using the stochastic version of the map-based Chialvo model. To simulate non-identical neurons, a mismatch is introduced in one of the main parameters of the model. Subsequently, the synchronization of the neurons is studied as a function of this mismatch, the noise introduced in the stochastic model, and the coupling strength between the neurons. We propose the simplest neuron network for study, as its analysis is more straightforward and does not compromise generality. Within this network, two non-identical neuron maps are electrically coupled. In order to understand whether specific behaviors affect the global behavior of the system, we consider different cases related to the behavior of the neurons (chaotic or periodic). Furthermore, we study how variations in model parameters affect the firing frequency in all cases. Additionally, we consider that the two neurons have both excitatory and inhibitory couplings. Consequently, we identify critical values of noise and mismatch for achieving satisfactory synchronization between the neurons in both cases. Finally, we conjecture that the results are of a general nature and are applicable beyond this minimal setting.
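A minimal version of this setup, two electrically coupled stochastic Chialvo maps with a parameter mismatch, can be sketched as follows (the parameter values are a commonly used set and are assumptions, not necessarily the paper's):

```python
import numpy as np

def coupled_chialvo(n_steps=5000, eps=0.05, mismatch=0.01, sigma=0.001, seed=1):
    """Two electrically coupled stochastic Chialvo maps:
    x' = x^2 exp(y - x) + k + coupling + noise,  y' = a*y - b*x + c.
    Parameters a=0.89, b=0.6, c=0.28, k=0.03 are a commonly used set
    (assumed here); the mismatch perturbs k of the second neuron."""
    a, b, c, k = 0.89, 0.6, 0.28, 0.03
    k2 = k * (1.0 + mismatch)
    rng = np.random.default_rng(seed)
    x = np.array([0.5, 0.6])
    y = np.array([0.5, 0.5])
    traj = np.empty((n_steps, 2))
    for n in range(n_steps):
        noise = sigma * rng.standard_normal(2)
        gap = eps * (x[::-1] - x)  # diffusive (electrical) coupling
        xn = x**2 * np.exp(y - x) + np.array([k, k2]) + gap + noise
        y = a * y - b * x + c
        x = xn
        traj[n] = x
    return traj

def sync_error(traj, discard=1000):
    """Mean absolute difference of the two membrane variables after a
    transient; small values indicate satisfactory synchronization."""
    return float(np.abs(traj[discard:, 0] - traj[discard:, 1]).mean())
```

Scanning `eps`, `mismatch`, and `sigma` while monitoring `sync_error` reproduces the kind of critical-value analysis described above.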
Large language models (LLMs) have shown impressive capabilities in generating program code, opening exciting opportunities for applying program synthesis to games. In this work, we explore the potential of LLMs to directly synthesize usable code for a wide range of gaming applications, focusing on two programming languages, Python and Java. We use an evolutionary hill-climbing algorithm, where the mutations and seeds of the initial programs are controlled by LLMs. For Python, the framework covers various game-related tasks, including five miniature versions of Atari games, ten levels of Baba is You, an environment inspired by Asteroids, and a maze generation task. For Java, the framework contains 12 games from the TAG tabletop games framework. Across 29 tasks, we evaluated 12 language models for Python and 8 for Java. Our findings suggest that the performance of LLMs depends more on the task than on model size. While larger models generate more executable programs, these do not always result in higher-quality solutions and are much more expensive. No model has a clear advantage, although on any specific task one model may be better. Trying many models on a problem and using the best result is therefore a reasonable strategy.
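The evolutionary hill-climbing loop can be sketched generically. In the setting above, the mutation operator would prompt an LLM to rewrite the incumbent program; here it is abstracted into an injected callable (`mutate` is a hypothetical stand-in, and the scoring function would be a game-specific fitness evaluator):

```python
import random

def hill_climb(initial_program, mutate, score, iterations=100, seed=0):
    """(1+1) evolutionary hill climbing: keep a single incumbent
    program, propose a mutated variant each step, and accept it if it
    scores at least as well as the incumbent."""
    rng = random.Random(seed)
    best, best_score = initial_program, score(initial_program)
    for _ in range(iterations):
        candidate = mutate(best, rng)
        s = score(candidate)
        if s >= best_score:
            best, best_score = candidate, s
    return best, best_score
```

With an LLM-backed `mutate`, the same loop applies unchanged; only the representation of a "program" and the fitness evaluation differ per task.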