Large language models (LLMs) have shown high agreement with human raters across a variety of tasks, demonstrating potential to ease the challenges of human data collection. In computational social science (CSS), researchers are increasingly leveraging LLM annotations to complement slow and expensive human annotations. Still, guidelines for collecting and using LLM annotations, without compromising the validity of downstream conclusions, remain limited. We introduce Confidence-Driven Inference: a method that combines LLM annotations and LLM confidence indicators to strategically select which human annotations should be collected, with the goal of producing accurate statistical estimates and provably valid confidence intervals while reducing the number of human annotations needed. Our approach comes with safeguards against LLM annotations of poor quality, guaranteeing that the conclusions will be both valid and no less accurate than if we only relied on human annotations. We demonstrate the effectiveness of Confidence-Driven Inference over baselines in statistical estimation tasks across three CSS settings--text politeness, stance, and bias--reducing the needed number of human annota
Software engineering is knowledge-intensive work, and how to manage software engineering knowledge has received much attention. This systematic review identifies empirical studies of knowledge management initiatives in software engineering, and discusses the concepts studied, the major findings, and the research methods used. Seven hundred and sixty-two articles were identified, of which 68 were studies in an industry context. Of these, 29 were empirical studies and 39 reports of lessons learned. More than half of the empirical studies were case studies. The majority of empirical studies relate to technocratic and behavioural aspects of knowledge management, while there are few studies relating to economic, spatial and cartographic approaches. A finding reported across multiple papers was the need to not focus exclusively on explicit knowledge, but also consider tacit knowledge. We also describe implications for research and for practice.
The presented algorithms for segmentation and tracking follow a 3-step approach where we detect, track and finally segment nuclei. In the preprocessing phase, we detect centroids of the cell nuclei using a convolutional neural network (CNN) for the 2D images and a Laplacian-of-Gaussian Scale Space Maximum Projection approach for the 3D data sets. Tracking was performed in a backwards fashion on the predicted seed points, i.e., starting at the last frame and sequentially connecting corresponding objects until the first frame was reached. Correspondences were identified by propagating detections of a frame t to its preceding frame t-1 and by combining redundant detections using a hierarchical clustering approach. The tracked centroids were then used as input to variants of the seeded watershed algorithm to obtain the final segmentation.
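The backward linking step can be illustrated with a small sketch. It is not the authors' implementation: it assumes 2D seed points given as `(x, y)` tuples, swaps the hierarchical clustering for a plain single-linkage grouping with a fixed distance threshold, and the names `cluster`, `track_backwards` and `radius` are hypothetical. Positions tracked at frame t are carried unchanged to frame t-1, pooled with the detections found there, and every group of nearby points is merged into one position (the group mean).

```python
from itertools import combinations
from math import dist


def cluster(points, radius):
    """Single-linkage grouping: points closer than `radius` share a label."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if dist(points[i], points[j]) < radius:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def track_backwards(frames, radius=5.0):
    """frames: one list of (x, y) seed points per time step.

    Returns a list of tracks, each a dict {frame_index: (x, y)}.
    """
    last = len(frames) - 1
    tracks = [{last: p} for p in frames[last]]
    for t in range(last, 0, -1):
        active = [tr for tr in tracks if t in tr]
        # Propagate tracked positions to frame t-1 and pool them
        # with the detections of that frame.
        points = [tr[t] for tr in active] + list(frames[t - 1])
        labels = cluster(points, radius)
        for label in sorted(set(labels)):
            members = [i for i, lab in enumerate(labels) if lab == label]
            cx = sum(points[i][0] for i in members) / len(members)
            cy = sum(points[i][1] for i in members) / len(members)
            owners = [active[i] for i in members if i < len(active)]
            if not owners:
                # A group made only of detections at t-1 starts a new track.
                owners = [{}]
                tracks.append(owners[0])
            for tr in owners:
                tr[t - 1] = (cx, cy)
    return tracks
```

In this sketch a track with no nearby detection in the preceding frame simply keeps its propagated position, and two tracks that fall into the same group share one position from that frame backwards.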
Quantum process tomography of each directly implementable quantum gate used in the IBM quantum processors is performed to compute the gate error, in order to check the viability of complex quantum operations in the superconductivity-based quantum computers introduced by IBM and to compare the quality of these gates with that of the corresponding gates implemented using other technologies. Quantum process tomography (QPT) of C-NOT gates has been performed for three configurations available in the IBM QX4 processor. For all the other allowed gates, QPT has been performed for every allowed position (i.e., by placing the gates in different qubit lines) for the IBM QX4 architecture, and thus gate fidelities are obtained for both single-qubit and 2-qubit gates. Gate fidelities are observed to be lower than the corresponding values obtained in other technologies, such as NMR. Further, gate fidelities for all the single-qubit gates are obtained for the IBM QX2 architecture by placing the gates in the third qubit line ($q[2]$). It is observed that the IBM QX4 architecture yields better gate fidelity compared to IBM QX2 in all cases except the case of $\operatorname{Y}$ gate as far as the gate fidelity corresponding t
A tree view or tree navigator is used to display hierarchical data organized in the form of a tree. In a tree structure there are parent and child nodes. The child nodes may further have descendants to n levels. There are many methods to make navigation easy. Some of these are expanding and collapsing branches, splitting the tree, displaying a parent node in a separate tree, zooming branches, scrolling in various directions, etc. It is still a difficult exercise to handle large trees efficiently. Efforts continue to manage a large number of nodes with greater speed, greater control, user friendliness and aesthetics. This article illustrates five inventions on tree navigators selected from the US patent database. Each of them tries to solve various problems relating to the tree navigator in different ways. Each invention is also analyzed from a TRIZ perspective.
Dialog boxes are useful for displaying warnings, errors, confirmations, etc. in special situations. A typical dialog box is displayed in a small window with a text message along with a few options for the user to select. However, there are certain difficulties associated with programming and implementing a conventional dialog box, such as heavy programming effort, rigidity of the hard-coded message, obscuring of screen space, and so on. There is a need to overcome these difficulties to make dialog boxes more efficient and useful. The modality of dialog boxes also creates some limitations. While modal dialog boxes need to be closed explicitly by the user, modeless dialog boxes can grow in number and become difficult to control. Thus, an ideal dialog box should be free of all the above-mentioned drawbacks. The dialog box should not obscure the screen. The user should be able to open multiple dialog boxes without obscuring the screen. This article analyses 5 interesting inventions on dialog boxes selected from the US Patent database. Each invention tries to overcome some limitations of a conventional dialog box and provides some innovative features. Each solut
Searches for phrases and word sets in large text arrays by means of additional indexes are considered. Their use may reduce the query-processing time by an order of magnitude in comparison with standard inverted files.
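As an illustration of the idea, the sketch below compares a positional inverted file with one possible additional index keyed on adjacent word pairs. The structure of the additional indexes is not specified in the text above, so the pair index, the function names and the whitespace tokenizer are all assumptions made for this example. With the pair index, a phrase query starts from the posting list of a word pair, which is typically much shorter than the posting list of a frequent single word.

```python
from collections import defaultdict


def build_indexes(docs):
    """Build a positional inverted index and an additional adjacent-pair index."""
    inverted = defaultdict(lambda: defaultdict(list))  # word -> doc_id -> positions
    pairs = defaultdict(lambda: defaultdict(list))     # (w1, w2) -> doc_id -> positions of w1
    for doc_id, text in enumerate(docs):
        words = text.lower().split()
        for pos, word in enumerate(words):
            inverted[word][doc_id].append(pos)
        for pos, pair in enumerate(zip(words, words[1:])):
            pairs[pair][doc_id].append(pos)
    return inverted, pairs


def search_inverted(inverted, phrase):
    """Phrase search using only the standard positional inverted index."""
    words = phrase.lower().split()
    hits = []
    for doc_id, starts in inverted.get(words[0], {}).items():
        for start in starts:
            if all(start + k in inverted.get(w, {}).get(doc_id, ())
                   for k, w in enumerate(words)):
                hits.append((doc_id, start))
    return hits


def search_pairs(pairs, phrase):
    """Phrase search using the additional pair index (two or more words)."""
    words = phrase.lower().split()
    keys = list(zip(words, words[1:]))
    if not keys:
        raise ValueError("the pair index needs a phrase of at least two words")
    hits = []
    for doc_id, starts in pairs.get(keys[0], {}).items():
        for start in starts:
            if all(start + k in pairs.get(key, {}).get(doc_id, ())
                   for k, key in enumerate(keys)):
                hits.append((doc_id, start))
    return hits
```

Both functions return `(doc_id, start_position)` pairs for each occurrence of the phrase. The sketch shows only the lookup logic and makes no attempt to reproduce the reported speed-up, which depends on collection size and posting-list storage.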
As NASA's New Horizons spacecraft exits the Solar System bound for interstellar space, it has traveled so far that the nearest stars have shifted markedly from their positions seen from Earth. We demonstrated this by imaging the Proxima Centauri and Wolf 359 fields from Earth and New Horizons on 2020 April 23, when the spacecraft was 47.1 au distant. The observed parallaxes for Proxima Centauri and Wolf 359 are $32.4''$ and $15.7'',$ respectively. These measurements are not of research grade, but directly seeing large stellar parallaxes between two widely separated simultaneous observers is vividly educational. Using the New Horizons positions of the two stars alone, referenced to the three-dimensional model of the solar neighborhood constructed from Gaia DR3 astrometry, further provides the spacecraft spatial position relative to nearby stars with 0.44 au accuracy. The range to New Horizons from the Solar System barycenter is recovered to 0.27 au accuracy, and its angular direction to $0.4^\circ$ accuracy, when compared to the precise values from NASA Deep Space Network tracking. This is the first time optical stellar astrometry has been used to determine the three-dimensional loc
We present a new technique to constrain the gravitational potential of a galaxy from the observed stellar mass surface density alone under a number of assumptions. It uses the classical Eddington Inversion Method to compute the phase-space distribution function (DF) needed for the stars to reside in a given gravitational potential. In essence, each potential defines a set of density profiles, and it is the expansion of the observed profile in this database that provides the DF. If the required DF becomes negative, then the potential is inconsistent with the observed stars and can be discarded. The technique is particularly well-suited for analyzing low-mass, low surface brightness galaxies, where photometric but not spectroscopic data can be obtained. The recently discovered low surface brightness galaxy 'Nube' was used to showcase its application. For Nube's observed stellar core to be reproduced with a non-negative DF, cuspy NFW (Navarro, Frenk, and White) potentials are highly disfavored compared with potentials having cores (Schuster-Plummer or rho-230). The method assumes the stellar system to have spherical symmetry and an isotropic velocity distribution; however, we discuss simple extensi
In this work, we demonstrate that the Hindmarsh-Rose model subjected to additive white noise exhibits birhythmicity. Specifically, the system fluctuates between two distinct bursting attractors characterized by different numbers of spikes. This behavior is observed not only within the bistable region bounded by two saddle-node bifurcations of limit cycles but also beyond these boundaries. This phenomenon is associated with the ghost effect, typically observed near deterministic saddle-node bifurcations. We map the region of stochastic birhythmicity in terms of the noise intensity and a key deterministic parameter that controls the dynamics of fast ion channels. To provide an analytical foundation, we introduce a simple stochastic model with a single saddle-node bifurcation. In this model, stochastic birhythmicity is similarly characterized as a function of noise intensity and the control parameter.
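A rough numerical sketch can show how noise acts on both sides of a saddle-node point. The simple stochastic model used in the work is not given in the text above, so the sketch assumes the textbook saddle-node normal form dx = (mu + x^2) dt + sigma dW, integrated with the Euler-Maruyama scheme; the function name `passage_time` and all parameter values are illustrative. For mu > 0 there is no fixed point, but the trajectory slows down near x = 0 (the ghost of the vanished fixed points). For mu < 0 a stable and an unstable fixed point exist, and only noise can carry the state past the unstable one.

```python
import math
import random


def passage_time(mu, sigma, x0=-2.0, x_end=2.0, dt=1e-3, t_max=200.0, seed=0):
    """Time for dx = (mu + x^2) dt + sigma dW to go from x0 to x_end.

    Uses the Euler-Maruyama scheme; returns None if x_end is not
    reached before t_max.
    """
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    x, t = x0, 0.0
    while t < t_max:
        x += (mu + x * x) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
        t += dt
        if x >= x_end:
            return t
    return None


def deterministic_passage_time(mu, x0=-2.0, x_end=2.0):
    """Closed-form passage time for sigma = 0 and mu > 0."""
    r = math.sqrt(mu)
    return (math.atan(x_end / r) - math.atan(x0 / r)) / r
```

Calling `passage_time` with `sigma=0` and a small positive `mu` should reproduce the closed-form value, which grows like pi / sqrt(mu) as mu approaches zero. With `mu < 0` and `sigma=0` the function returns `None`, whereas a nonzero `sigma` can produce a finite, seed-dependent escape time.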
As video games continue to evolve, understanding what drives player enjoyment remains a key challenge. Player reviews provide valuable insights, but their unstructured nature makes large-scale analysis difficult. This study applies generative AI and machine learning, leveraging the Microsoft Phi-4 small language model (SLM) and Google Cloud, to quantify and analyze game reviews from the Steam and Meta Quest stores. The approach converts qualitative feedback into structured data, enabling comprehensive evaluation of key game design elements, monetization models, and platform-specific trends. The findings reveal distinct patterns in player preferences across PC and VR games, highlighting factors that contribute to higher player enjoyment. By using Google Cloud for large-scale data storage and processing, this study establishes a scalable framework for game review analysis. The study's insights offer actionable guidance for game developers, helping optimize game mechanics, pricing strategies, and player engagement.
Quantum Candies, or Qandies, provide a lucid way of understanding the concepts of quantum information and quantum science in the language of candies. The central idea of qandies is to depict quantum science intuitively for the general public, which makes sense because most of the research in this domain is funded by taxpayers. The qandies model has already been used to explain the essential concepts of quantum science and quantum cryptography. However, teleportation and related concepts are yet to be explained. Motivated by this fact, we investigate and extend the idea of Jacobs and Lin-Mor-Shapira to explain teleportation using qandies. Here, we explicitly design the teleportation protocol and construct a circuit model using qandy gates. The protocol succeeds when the correlated qandies are appropriately pre-shared and some local operations are applied at both ends. The model we develop can be a valuable tool for science and engineering educators who want to help the general public gain more insight into quantum science and technology.
The popular frameworks for self-supervised learning of speech representations have largely focused on frame-level masked prediction of speech regions. While this has shown promising downstream performance for speech recognition and related tasks, it has largely ignored factors of speech that are encoded at a coarser level, like characteristics of the speaker or channel that remain consistent throughout a speech utterance. In this work, we propose a framework for Learning Disentangled Self Supervised (termed Learn2Diss) representations of speech, which consists of a frame-level and an utterance-level encoder module. The two encoders are initially learned independently, where the frame-level model is largely inspired by existing self-supervision techniques, thereby learning pseudo-phonemic representations, while the utterance-level encoder is inspired by contrastive learning of pooled embeddings, thereby learning pseudo-speaker representations. The joint learning of these two modules consists of disentangling the two encoders using a mutual information based criterion. With several downstream evaluation experiments, we show that the proposed Learn2Diss achieves state-of-the-
Counterfactual quantum communication is one of the most interesting facets of quantum communication, allowing two parties to communicate without any transmission of quantum or classical particles between the parties involved in the communication process. This aspect of quantum communication originates from the interaction-free measurements where the chained quantum Zeno effect plays an important role. Here, we propose a new counterfactual quantum communication protocol for transmitting an entangled state from a pair of electrons to two independent photons. Interestingly, the protocol proposed here shows that the counterfactual method can be employed to transfer information from house qubits to flying qubits. Following this, we show that the protocol finds uses in building quantum repeaters leading to a counterfactual quantum network, enabling counterfactual communication over a linear quantum network.
Large language models (LLMs) have shown impressive capabilities in generating program code, opening exciting opportunities for applying program synthesis to games. In this work, we explore the potential of LLMs to directly synthesize usable code for a wide range of gaming applications, focusing on two programming languages, Python and Java. We use an evolutionary hill-climbing algorithm, where the mutations and seeds of the initial programs are controlled by LLMs. For Python, the framework covers various game-related tasks, including five miniature versions of Atari games, ten levels of Baba is You, an environment inspired by Asteroids, and a maze generation task. For Java, the framework contains 12 games from the TAG tabletop games framework. Across 29 tasks, we evaluated 12 language models for Python and 8 for Java. Our findings suggest that the performance of LLMs depends more on the task than on model size. While larger models generate more executable programs, these do not always result in higher-quality solutions but are much more expensive. No model has a clear advantage, although on any specific task, one model may be better. Trying many models on a problem and using the be
Manifold visualisation techniques are commonly used to visualise high-dimensional datasets in physical sciences. In this paper we apply a recently introduced manifold visualisation method, called Slisemap, to datasets from physics and chemistry. Slisemap combines manifold visualisation with explainable artificial intelligence. Explainable artificial intelligence is used to investigate the decision processes of black box machine learning models and complex simulators. With Slisemap we find an embedding such that data items with similar local explanations are grouped together. Hence, Slisemap gives us an overview of the different behaviours of a black box model. This makes Slisemap a supervised manifold visualisation method, where the patterns in the embedding reflect a target property. In this paper we show how Slisemap can be used and evaluated on physical data and that Slisemap is helpful in finding meaningful information on classification and regression models trained on these datasets.
Analyzing and certifying stability and attractivity of nonlinear systems is a topic of research interest that has been extensively investigated by control theorists and engineers for many years. Despite that, accurately estimating domains of attraction for nonlinear systems remains a challenging task, where available estimation approaches are either conservative or limited to low-dimensional systems. In this work, we propose an iterative approach to accurately underapproximate safe (i.e., state-constrained) domains of attraction for general discrete-time autonomous nonlinear systems. Our approach relies on implicit representations of safe backward reachable sets of safe regions of attraction, where such regions can be easily constructed using, e.g., quadratic Lyapunov functions. The iterations of our approach are monotonic (in the sense of set inclusion), where each iteration results in a safe region of attraction, given as a sublevel set, that underapproximates the safe domain of attraction. The sublevel set representations of the resulting regions of attraction can be efficiently utilized in verifying the inclusion of given points of interest in the safe domain of attraction.
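The point-inclusion check can be pictured with a small sketch. This is one straightforward reading of a safe backward reachable set, not the construction used in the work: a point belongs to the k-step set if its trajectory stays in the safe set until it enters a known safe region of attraction within k steps. The scalar map `f`, the safe set |x| <= 0.45, the region {x : x^2 <= 0.01} and the function name are all hypothetical choices made for illustration.

```python
def in_safe_backward_reachable_set(x, f, is_safe, in_region, k):
    """Return True if the trajectory from x stays safe and enters the
    region of attraction within k steps of the map f.

    Assumes the region is itself contained in the safe set and is
    invariant under f.
    """
    for _ in range(k + 1):
        if not is_safe(x):
            return False
        if in_region(x):
            return True
        x = f(x)
    return False


# Illustrative system: x+ = 0.5 x + x^2, with the origin as a stable fixed point.
def f(x):
    return 0.5 * x + x * x


def is_safe(x):
    return abs(x) <= 0.45      # assumed state constraint


def in_region(x):
    return x * x <= 0.01       # sublevel set of V(x) = x^2, invariant here
```

Because the check for k + 1 steps accepts every point accepted for k steps, the sets grow monotonically with k, mirroring the set-inclusion property described above. For example, `in_safe_backward_reachable_set(0.4, f, is_safe, in_region, 10)` tests whether 0.4 reaches the region within ten steps without leaving the safe set.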
Sample selection models are a widely used approach for correcting bias caused by data that are missing not at random. Their formulation requires specifying the variables that influence the outcome and those that drive the selection process. This specification is often based on expert knowledge, which can result in the inclusion of irrelevant variables or the omission of important ones. Moreover, to avoid inferential problems such as practical non-identifiability, practitioners frequently impose exclusion restrictions, that is, model specifications in which certain variables predict selection but have no effect on the outcome of interest. A recent proposal employs adaptive LASSO to select the variables that enter into the outcome and selection equations, but its performance depends on the so-called covariance assumption, which can be violated in small to moderate samples. To address these challenges, we propose two families of spike-and-slab priors to conduct Bayesian variable selection in sample selection models. These prior structures allow for constructing a Gibbs sampler with tractable conditionals, which is scalable to the dimensions of practical interest. We illustrate the per
Brain age is the estimate of biological age derived from neuroimaging datasets using machine learning algorithms. Increasing brain age with respect to chronological age can reflect increased vulnerability to neurodegeneration and cognitive decline. In this paper, we study NeuroVNN, based on coVariance neural networks, as a paradigm for a foundation model for brain age prediction. NeuroVNN is pre-trained as a regression model on a healthy population to predict chronological age using cortical thickness features and fine-tuned to estimate brain age in different neurological contexts. Importantly, NeuroVNN adds anatomical interpretability to brain age and has a `scale-free' characteristic that allows its transference to datasets curated according to any arbitrary brain atlas. Our results demonstrate that NeuroVNN can extract biologically plausible brain age estimates in different populations, as well as transfer successfully to datasets of dimensionalities distinct from that of the dataset used to train NeuroVNN.
Synchronization dynamics is a phenomenon of great interest in many fields of science. One of the most important fields is neuron dynamics, as synchronization in certain regions of the brain is related to some of the most common mental illnesses. To study the impact of network heterogeneity on neuronal synchronization, we analyze a small-world network of non-identical Chialvo neurons that are electrically coupled. We introduce a mismatch in one of the model parameters to make the network heterogeneous. Our study examines the effects of this parameter mismatch, the noise intensity in the stochastic model, and the coupling strength between neurons on synchronization and firing frequency. We have identified critical values of noise intensity, parameter mismatch, and rewiring probability that facilitate effective synchronization within the network. Furthermore, we observe that the balance between excitatory and inhibitory connections plays a crucial role in achieving global synchronization. Our findings offer insights into the mechanisms driving synchronization dynamics in complex neuron networks.