Concurrent transcranial magnetic stimulation and electroencephalography (TMS-EEG) is a powerful yet challenging methodology for basic research and clinical applications. Effective TMS-EEG recording and analysis require attention to several aspects of an experiment, including artifact management, data analysis and interpretation, and protocols. This mini review offers an extensive overview of TMS-EEG methodology in experimental and computational procedures. The case study leverages an openly available, high-quality EEG dataset to examine alterations in cortical activity. By applying intermittent theta-burst stimulation (iTBS) and continuous theta-burst stimulation (cTBS) to the left dorsolateral prefrontal cortex (DLPFC) in healthy individuals, we observe changes in oscillatory patterns within the EEG data. The dataset includes meticulously extracted resting-state EEG recordings, TMS-evoked potential data, and MRI scans. To process these data, we utilized Brainstorm, an open-source Matlab application, which facilitated noise reduction through independent component analysis and signal-space projection techniques. It allowed us to identify, visualize, and analyze TMS-evoked potenti
With the rapid increase of published open datasets, it is crucial to support the open data progress in smart cities while considering the open data quality. In the Czech Republic, and its National Open Data Catalogue (NODC), the open datasets are usually evaluated based on their metadata only, while leaving the content and the adherence to the recommended data structure to the sole responsibility of the data providers. The interoperability of open datasets remains unknown. This paper therefore aims to propose a novel content-aware quality evaluation framework that assesses the quality of open datasets based on five data quality dimensions. With the proposed framework, we provide a fundamental view on the interoperability-oriented data quality of Czech open datasets, which are published in NODC. Our evaluations find that domain-specific open data quality assessments are able to detect data quality issues beyond traditional heuristics used for determining Czech open data quality, increase their interoperability, and thus increase their potential to bring value for the society. The findings of this research are beneficial not only for the case of the Czech Republic, but also can be ap
Ultra-weak photon emission (UPE) from living systems is widely hypothesized to reflect underlying self-organization and long-range coordination in biological dynamics. However, distinguishing biologically driven correlations from trivial stochastic or instrumental effects requires a robust, multi-method framework. In this work, we establish and benchmark a comprehensive analysis pipeline for photon-count time series, combining Distribution Entropy Analysis, Rényi entropy, Detrended Fluctuation Analysis, its generalization Multifractal Detrended Fluctuation Analysis, and tail-statistics characterization. Surrogate signals constructed from Poisson processes, Fractional Gaussian Noise, and Renewal Processes with power-law waiting times are used to validate sensitivity to memory, intermittency, and multifractality. Across all methods, a coherent hierarchy of dynamical regimes is recovered, demonstrating internal methodological consistency. Application to experimental dark-count data and attenuated coherent-laser emission confirms Poisson-like behavior, establishing an essential statistical baseline for UPE studies. The combined results show that this multi-resolution approach reliab
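As a rough illustration of the Poisson baseline check described above, the sketch below runs first-order Detrended Fluctuation Analysis on a synthetic Poisson count series. It is a minimal sketch and not the pipeline used in the work: the function names `poisson_sample` and `dfa_exponent`, the mean count of 5 per bin, the series length, and the window sizes are all assumptions chosen for illustration. For uncorrelated counts the scaling exponent is expected to lie near 0.5.

```python
import math
import random


def poisson_sample(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def dfa_exponent(series, scales):
    """Estimate the first-order DFA scaling exponent of a count series."""
    mean = sum(series) / len(series)
    # Profile: cumulative sum of the mean-subtracted series.
    profile, total = [], 0.0
    for x in series:
        total += x - mean
        profile.append(total)

    log_s, log_f = [], []
    for s in scales:
        n_win = len(profile) // s  # non-overlapping windows of length s
        t_mean = (s - 1) / 2.0
        stt = sum((t - t_mean) ** 2 for t in range(s))
        sq_sum = 0.0
        for w in range(n_win):
            seg = profile[w * s:(w + 1) * s]
            y_mean = sum(seg) / s
            # Least-squares linear trend inside the window.
            slope = sum((t - t_mean) * (y - y_mean)
                        for t, y in enumerate(seg)) / stt
            # Squared residuals around that trend.
            sq_sum += sum((y - y_mean - slope * (t - t_mean)) ** 2
                          for t, y in enumerate(seg))
        log_s.append(math.log(s))
        log_f.append(0.5 * math.log(sq_sum / (n_win * s)))

    # Slope of log F(s) against log s is the scaling exponent.
    ls_mean = sum(log_s) / len(log_s)
    lf_mean = sum(log_f) / len(log_f)
    num = sum((a - ls_mean) * (b - lf_mean) for a, b in zip(log_s, log_f))
    den = sum((a - ls_mean) ** 2 for a in log_s)
    return num / den


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [poisson_sample(rng, 5.0) for _ in range(8192)]  # hypothetical surrogate
alpha = dfa_exponent(counts, [16, 32, 64, 128, 256])
print(f"DFA exponent of the Poisson surrogate: {alpha:.2f}")
```

An exponent clearly above 0.5 on real data would point to persistent memory, which is the kind of departure from the Poisson baseline the surrogate tests are meant to expose.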
The emergence of breakthrough artificial intelligence (AI) techniques has led to a renewed focus on how small data settings, i.e., settings with limited information, can benefit from such developments. This includes societal issues such as how best to include under-represented groups in data-driven policy and decision making, or the health benefits of assistive technologies such as wearables. We provide a conceptual overview, in particular contrasting small data with big data, and identify common themes from exemplary case studies and application areas. Potential solutions are described in a more detailed technical overview of current data analysis and modelling techniques, highlighting contributions from different disciplines, such as knowledge-driven modelling from statistics and data-driven modelling from computer science. By linking application settings, conceptual contributions and specific techniques, we highlight what is already feasible and suggest what an agenda for fully leveraging small data might look like.
The development of modern information technologies makes it possible to collect and analyze huge amounts of statistical data in different spheres of life. The main problem is not only to collect but also to process all relevant information. The purpose of our work is to show an example of intelligent data analysis in a field as complex and non-formalized as science. Using statistical data about a scientific periodical, it is possible to perform a comprehensive analysis of it and to solve different practical problems. The combination of various approaches, including statistical analysis, methods of complex network theory, and different techniques that can be used for concept mapping, permits an intelligent data analysis that uncovers underlying patterns and hidden connections. The results of such analysis can be used for particular practical problems such as information retrieval within a journal.
Despite technological and medical advances, the detection, interpretation, and treatment of cancer based on imaging data continue to pose significant challenges. These include inter-observer variability, class imbalance, dataset shifts, inter- and intra-tumour heterogeneity, malignancy determination, and treatment effect uncertainty. Given the recent advancements in Generative Adversarial Networks (GANs), data synthesis, and adversarial training, we assess the potential of these technologies to address a number of key challenges of cancer imaging. We categorise these challenges into (a) data scarcity and imbalance, (b) data access and privacy, (c) data annotation and segmentation, (d) cancer detection and diagnosis, and (e) tumour profiling, treatment planning and monitoring. Based on our analysis of 164 publications that apply adversarial training techniques in the context of cancer imaging, we highlight multiple underexplored solutions with research potential. We further contribute the Synthesis Study Trustworthiness Test (SynTRUST), a meta-analysis framework for assessing the validation rigour of medical image synthesis studies. SynTRUST is based on 26 concrete measures of thoro
Dimensionality-reduction methods are a fundamental tool in the analysis of large data sets. These algorithms work on the assumption that the "intrinsic dimension" of the data is generally much smaller than the ambient dimension in which it is collected. Alongside their usual purpose of mapping data into a smaller dimension with minimal information loss, dimensionality-reduction techniques implicitly or explicitly provide information about the dimension of the data set. In this paper, we propose a new statistic that we call the $κ$-profile for analysis of large data sets. The $κ$-profile arises from a dimensionality-reduction optimization problem: namely that of finding a projection into $k$ dimensions that optimally preserves the secants between points in the data set. From this optimal projection we extract $κ$, the norm of the shortest projected secant from among the set of all normalized secants. This $κ$ can be computed for any $k$; thus the tuple of $κ$ values (indexed by dimension) becomes a $κ$-profile. Algorithms such as the Secant-Avoidance Projection algorithm and the Hierarchical Secant-Avoidance Projection algorithm provide a computationally feasible means of estimatin
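The definition of $κ$ above can be made concrete with a small sketch. It computes the norm of the shortest projected normalized secant for a fixed projection, here simply keeping the first $k$ coordinates. This is an assumption made for illustration: it does not implement the Secant-Avoidance Projection algorithms, so each value is only a lower bound on the $κ$ that an optimized projection would achieve. The function names and the toy data set (eight points on a circle embedded in three dimensions) are hypothetical.

```python
import math
from itertools import combinations


def normalized_secants(points):
    """Return the unit-length difference vectors between all distinct point pairs."""
    secants = []
    for p, q in combinations(points, 2):
        diff = [a - b for a, b in zip(p, q)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm > 0.0:  # skip coincident points
            secants.append([d / norm for d in diff])
    return secants


def kappa_axis_projection(points, k):
    """Shortest projected unit secant when only the first k coordinates are kept.

    Keeping coordinates is a stand-in for an optimized projection, so the
    result is a lower bound on the optimal kappa for this k.
    """
    return min(
        math.sqrt(sum(c * c for c in secant[:k]))
        for secant in normalized_secants(points)
    )


# Toy data: eight points on the unit circle in the xy-plane of R^3.
circle = [
    (math.cos(2 * math.pi * i / 8), math.sin(2 * math.pi * i / 8), 0.0)
    for i in range(8)
]
profile = [kappa_axis_projection(circle, k) for k in (1, 2, 3)]
print(profile)
```

For this toy set the profile jumps from essentially zero at $k=1$ to essentially one at $k=2$, which is the kind of step one would read as a hint that the data are two-dimensional.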
Remote sensing image object detection (RSIOD) aims to identify and locate specific objects within satellite or aerial imagery. However, there is a scarcity of labeled data in current RSIOD datasets, which significantly limits the performance of current detection algorithms. Although existing techniques, e.g., data augmentation and semi-supervised learning, can mitigate this scarcity issue to some extent, they are heavily dependent on high-quality labeled data and perform worse in rare object classes. To address this issue, this paper proposes a layout-controllable diffusion generative model (i.e. AeroGen) tailored for RSIOD. To our knowledge, AeroGen is the first model to simultaneously support horizontal and rotated bounding box condition generation, thus enabling the generation of high-quality synthetic images that meet specific layout and object category requirements. Additionally, we propose an end-to-end data augmentation framework that integrates a diversity-conditioned generator and a filtering mechanism to enhance both the diversity and quality of generated data. Experimental results demonstrate that the synthetic data produced by our method are of high quality and diversit
Multi-view data, that is matched sets of measurements on the same subjects, have become increasingly common with advances in multi-omics technology. Often, it is of interest to find associations between the views that are related to the intrinsic class memberships. Existing association methods cannot directly incorporate class information, while existing classification methods do not take into account between-views associations. In this work, we propose a framework for Joint Association and Classification Analysis of multi-view data (JACA). Our goal is not to merely improve the misclassification rates, but to provide a latent representation of high-dimensional data that is both relevant for the subtype discrimination and coherent across the views. We motivate the methodology by establishing a connection between canonical correlation analysis and discriminant analysis. We also establish the estimation consistency of JACA in high-dimensional settings. A distinct advantage of JACA is that it can be applied to the multi-view data with block-missing structure, that is to cases where a subset of views or class labels is missing for some subjects. The application of JACA to quantify the a
We present a technique for spatiotemporal data analysis called nonlinear Laplacian spectral analysis (NLSA), which generalizes singular spectrum analysis (SSA) to take into account the nonlinear manifold structure of complex data sets. The key principle underlying NLSA is that the functions used to represent temporal patterns should exhibit a degree of smoothness on the nonlinear data manifold M, a constraint absent from classical SSA. NLSA enforces such a notion of smoothness by requiring that temporal patterns belong in low-dimensional Hilbert spaces V_l spanned by the leading l Laplace-Beltrami eigenfunctions on M. These eigenfunctions can be evaluated efficiently in high ambient-space dimensions using sparse graph-theoretic algorithms. Moreover, they provide orthonormal bases to expand a family of linear maps, whose singular value decomposition leads to sets of spatiotemporal patterns at progressively finer resolution on the data manifold. The Riemannian measure of M and an adaptive graph kernel width enhance the capability of NLSA to detect important nonlinear processes, including intermittency and rare events. The minimum dimension of V_l required to capture these features w
Modern large-scale astroparticle setups measure high-energy particles, gamma rays, neutrinos, radio waves, and the recently discovered gravitational waves. Ongoing and future experiments are located worldwide. The data acquired have different formats, storage concepts, and publication policies. Such differences are a crucial point in the era of Big Data and of multi-messenger analysis in astroparticle physics. We propose an open science web platform called ASTROPARTICLE.ONLINE which enables us to publish, store, search, select, and analyze astroparticle data. In the first stage of the project, the following components of a full data life cycle concept are under development: describing, storing, and reusing astroparticle data; software to perform multi-messenger analysis using deep learning; and outreach for students, post-graduate students, and others who are interested in astroparticle physics. Here we describe the concepts of the web platform and the first obtained results, including the metadata structure for astroparticle data, data analysis using convolutional neural networks, description of the binary data, and the outreach platform for those interested in astroparticle phy
The analysis of the leukemia data from the Whitehead/MIT group is a discriminant analysis (also called supervised learning). Among thousands of genes whose expression levels are measured, not all are needed for discriminant analysis: a gene may either not contribute to the separation of two types of tissues/cancers, or it may be redundant because it is highly correlated with other genes. There are two theoretical frameworks in which variable selection (or gene selection in our case) can be addressed. The first is model selection, and the second is model averaging. We have carried out model selection using the Akaike information criterion and the Bayesian information criterion with logistic regression (discrimination, prediction, or classification) to determine the number of genes that provide the best model. These model selection criteria set upper limits of 22-25 and 12-13 genes for this data set with 38 samples, and the best model consists of only one (no.4847, zyxin) or two genes. We have also carried out model averaging over the best single-gene logistic predictors using three different weights: maximized likelihood, prediction rate on training set, and equal weight. We have observed tha
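The two selection criteria mentioned above can be sketched in a few lines. The formulas are the standard ones, AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n. Only the sample size of 38 comes from the text; the candidate models and their log-likelihoods are made-up numbers used to show the mechanics, not results from the leukemia data, and the parameter counts assume one intercept plus one coefficient per gene.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 ln L + 2 k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: -2 ln L + k ln n."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)


N_SAMPLES = 38  # sample size stated in the text

# Hypothetical fits: name -> (maximized log-likelihood, number of parameters).
candidates = {
    "one_gene": (-4.0, 2),
    "two_genes": (-2.5, 3),
    "five_genes": (-2.0, 6),
}

aic_scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
bic_scores = {name: bic(ll, k, N_SAMPLES) for name, (ll, k) in candidates.items()}

best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)
print("AIC prefers:", best_by_aic)
print("BIC prefers:", best_by_bic)
```

With 38 samples each extra parameter costs ln 38 ≈ 3.64 under BIC but only 2 under AIC, so BIC tolerates fewer genes than AIC for the same gain in likelihood.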
Remote sensing vision tasks require extensive labeled data across multiple, interconnected domains. However, current generative data augmentation frameworks are task-isolated, i.e., each vision task requires training an independent generative model, and they ignore the modeling of geographical information and spatial constraints. To address these issues, we propose TerraGen, a unified layout-to-image generation framework that enables flexible, spatially controllable synthesis of remote sensing imagery for various high-level vision tasks, e.g., detection, segmentation, and extraction. Specifically, TerraGen introduces a geographic-spatial layout encoder that unifies bounding box and segmentation mask inputs, combined with a multi-scale injection scheme and mask-weighted loss to explicitly encode spatial constraints, from global structures to fine details. Also, we construct the first large-scale multi-task remote sensing layout generation dataset containing 45k images and establish a standardized evaluation protocol for this task. Experimental results show that TerraGen achieves the best generation image quality across diverse tasks. Additionally, TerraGen can be used as
A phenomenological analysis of lifetimes of bottom and charmed hadrons within the framework of the heavy quark expansion is performed. The baryon matrix element is evaluated using the bag model and the nonrelativistic quark model. We find that bottom-baryon lifetimes follow the pattern $τ(Ω_b)\simeqτ(Ξ_b^-)>τ(Λ_b)\simeqτ(Ξ_b^0)$. However, neither the lifetime ratio $τ(Λ_b)/τ( B_d)$ nor the absolute decay rates of the $Λ_b$ baryon and $B$ mesons can be explained. One way of solving both difficulties is to allow the presence of linear $1/m_Q$ corrections by scaling the inclusive nonleptonic width with the fifth power of the hadron mass $m_{H_Q}$ rather than the heavy quark mass $m_Q$. The hierarchy of bottom baryon lifetimes is dramatically modified to $τ(Λ_b)>τ(Ξ_b^-)>τ(Ξ_b^0)>τ( Ω_b)$: The longest-lived $Ω_b$ among bottom baryons in the OPE prescription now becomes shortest-lived. The replacement of $m_Q$ by $m_{H_Q}$ in nonleptonic widths is natural and justified in the PQCD-based factorization approach formulated in terms of hadron-level kinematics. For inclusive charmed baryon decays, we argue that since the heavy quark expansion does not converge, local duality cann
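The link between the mass replacement and a linear $1/m_Q$ correction can be made explicit with a short expansion. This is a sketch under the assumption that the hadron mass is written as $m_{H_Q} = m_Q + \bar\Lambda + O(1/m_Q)$, where $\bar\Lambda$ is a symbol introduced here for illustration and not taken from the text above.

```latex
\[
\Gamma_{\rm NL} \propto m_Q^5
\;\longrightarrow\;
\Gamma_{\rm NL} \propto m_{H_Q}^5
= m_Q^5 \left( 1 + \frac{\bar\Lambda}{m_Q} + \dots \right)^{5}
= m_Q^5 \left( 1 + \frac{5\,\bar\Lambda}{m_Q} + O\!\left(\frac{1}{m_Q^2}\right) \right)
\]
```

Under this assumption the rescaled width differs from the heavy-quark form already at first order in $1/m_Q$, and the width grows with the hadron mass, which is consistent with the heaviest bottom baryon, $Ω_b$, becoming the shortest-lived.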
We argue the usefulness of Gaifman graphs of first-order relational structures as an exploratory data analysis tool. We illustrate our approach with cases where the modular decompositions of these graphs reveal interesting facts about the data. Then, we introduce generalized notions of Gaifman graphs, enhanced with quantitative information, to which we can apply more general, existing decomposition notions via 2-structures; thus enlarging the analytical capabilities of the scheme. The very essence of Gaifman graphs makes this approach immediately appropriate for the multirelational data framework.
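The construction can be sketched as follows, assuming the usual definition in which two elements are adjacent whenever they occur together in some tuple of some relation. The edge weights, which count co-occurrences, are one possible reading of "enhanced with quantitative information" and not necessarily the generalization used in the paper. The function name and the toy relations are hypothetical.

```python
from collections import Counter
from itertools import combinations


def weighted_gaifman_graph(relations):
    """Map each unordered pair of elements to the number of tuples containing both.

    `relations` maps a relation name to a list of tuples. Elements must be
    mutually comparable (for example, all strings) so that pairs can be
    stored in a canonical order.
    """
    edges = Counter()
    for tuples in relations.values():
        for row in tuples:
            # Every pair of distinct elements in the same tuple is connected.
            for a, b in combinations(sorted(set(row)), 2):
                edges[(a, b)] += 1
    return edges


# Toy multirelational data set.
relations = {
    "authored": [("ana", "paper1"), ("bo", "paper1"), ("ana", "paper2")],
    "cites": [("paper2", "paper1")],
    "reviewed": [("bo", "paper2", "ana")],
}

graph = weighted_gaifman_graph(relations)
plain_edges = set(graph)  # the ordinary, unweighted Gaifman graph
for edge, weight in sorted(graph.items()):
    print(edge, weight)
```

Because the edges are pooled over all relations, the same function handles a single table or a multirelational database without change.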
This document defines the high-level metadata necessary to describe the physical parameter space of observed or simulated astronomical data sets, such as 2D-images, data cubes, X-ray event lists, IFU data, etc. The Characterisation data model is an abstraction which can be used to derive a structured description of any relevant data and thus to facilitate its discovery and scientific interpretation. The model aims at facilitating the manipulation of heterogeneous data in any VO framework or portal. A VO Characterisation instance can include descriptions of the data axes, the range of coordinates covered by the data, and details of the data sampling and resolution on each axis. These descriptions should be in terms of physical variables, independent of instrumental signatures as far as possible. Implementations of this model have been described in the IVOA Note available at: http://www.ivoa.net/Documents/latest/ImplementationCharacterisation.html Utypes derived from this version of the UML model are listed and commented in the following IVOA Note: http://www.ivoa.net/Documents/latest/UtypeListCharacterisationDM.html An XML schema has been built from the UML model and is available
An extreme-point symmetric mode decomposition (ESMD) method is proposed to improve the Hilbert-Huang Transform (HHT) in the following respects: (1) the sifting process is implemented with the aid of 1, 2, 3 or more inner interpolating curves, which classifies the methods into ESMD_I, ESMD_II, ESMD_III, and so on; (2) the last residual is defined as an optimal curve possessing a certain number of extreme points, instead of a general trend with at most one extreme point, which allows optimal sifting times and decompositions; (3) extreme-point symmetry is applied instead of envelope symmetry; (4) a data-based direct interpolating approach is developed to compute the instantaneous frequency and amplitude. One advantage of the ESMD method is that it determines an optimal global mean curve adaptively, which is better than the common least-squares method and running-mean approach; another is that it determines the instantaneous frequency and amplitude directly, which is better than the Hilbert-spectrum method. These advantages will improve the adaptive analysis of data from atmospheric and oceanic sciences, informatics, economics, ecology, medicine, seismology, and so on.
The Polarimetric and Helioseismic Imager (PHI) is the first deep-space solar spectropolarimeter, on board the Solar Orbiter (SO) space mission. It faces stringent requirements on science data accuracy, a dynamic environment, and severe limitations on telemetry volume. SO/PHI overcomes these restrictions through on-board instrument calibration and science data reduction, using dedicated firmware in FPGAs. This contribution analyses the accuracy of a data processing pipeline by comparing the results obtained with SO/PHI hardware to a reference from a ground computer. The results show that for the analysed pipeline the error introduced by the firmware implementation is well below the requirements of SO/PHI.
We present the design and test results of two optical data transmission ASICs for the High-Luminosity LHC (HL-LHC) experiments. These ASICs include a two-channel serializer (LOCs2) and a single-channel Vertical Cavity Surface Emitting Laser (VCSEL) driver (LOCld1V2). Both ASICs are fabricated in a commercial 0.25-μm Silicon-on-Sapphire (SoS) CMOS technology and operate at a data rate up to 8 Gbps per channel. The power consumption of LOCs2 and LOCld1V2 is 1.25 W and 0.27 W, respectively, at a data rate of 8 Gbps. LOCld1V2 has been verified to meet the radiation-tolerance requirements for HL-LHC experiments.
The ratio of the depths of spectral lines (line-depth ratio, LDR) is a powerful indicator of the effective temperature. The method based on this analysis is capable of discerning small temperature variations of individual stars. We apply this spectroscopic data analysis to three types of stars, namely an RS CVn type binary system, a young solar-type star, and a Cepheid variable. We show that individual LDRs, converted into temperature through calibration relations, reveal rotational and pulsational modulation of the average surface temperature with amplitudes of 127 K, 48 K and 1466 K in the three types of stars, with average estimated errors of a few tens of kelvin.