Marked Temporal Point Process (MTPP) has been well studied to model the event distribution in marked event streams, which can be used to predict the mark and arrival time of the next event. However, existing studies overlook that the distribution of event marks is highly imbalanced in many real-world applications, with some marks being frequent but others rare. The imbalance poses a significant challenge to the performance of the next event prediction, especially for events of rare marks. To address this issue, we propose a thresholding method, which learns thresholds to tune the mark probability normalized by the mark's prior probability to optimize mark prediction, rather than predicting the mark directly based on the mark probability as in existing studies. In conjunction with this method, we predict the mark first and then the time. In particular, we develop a novel neural MTPP model to support effective time sampling and estimation of mark probability without computationally expensive numerical improper integration. Extensive experiments on real-world datasets demonstrate the superior performance of our solution against various baselines for the next event mark and time predic
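As a rough illustration of the thresholding idea described in this abstract, the sketch below divides each mark probability by the mark's prior and then by a per-mark threshold, and predicts the mark with the largest resulting score. The function name `predict_mark`, the argmax decision rule, and all numbers are assumptions made for illustration; the abstract does not specify how the learned thresholds are applied.

```python
def predict_mark(mark_probs, priors, thresholds):
    """Pick a mark from prior-normalized, thresholded probabilities.

    mark_probs: model probability of each mark given the history.
    priors:     empirical frequency of each mark in the training data.
    thresholds: one positive threshold per mark (learned in the paper,
                hand-set here).
    The argmax-over-scores rule is a hypothetical choice, not the paper's.
    """
    scores = [
        p / (prior * tau)
        for p, prior, tau in zip(mark_probs, priors, thresholds)
    ]
    return max(range(len(scores)), key=lambda m: scores[m])


# Mark 0 is frequent (prior 0.9), mark 1 is rare (prior 0.1).
probs = [0.7, 0.3]
priors = [0.9, 0.1]

# A plain argmax over probs would pick mark 0; normalizing by the prior
# favours the rare mark, whose probability is three times its base rate.
rare_pick = predict_mark(probs, priors, [1.0, 1.0])

# Raising the rare mark's threshold makes it harder to predict.
frequent_pick = predict_mark(probs, priors, [1.0, 5.0])
```

With unit thresholds the rare mark is chosen; raising its threshold to 5 switches the prediction back to the frequent mark.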
The emergence of distinct local mark behaviours is becoming increasingly common in the applications of spatial marked point processes. This dynamic highlights the limitations of existing global mark correlation functions in accurately identifying the true patterns of mark associations/variations among points as distinct mark behaviours might dominate one another, giving rise to an incomplete understanding of mark associations. In this paper, we introduce a family of local indicators of mark association (LIMA) functions for spatial marked point processes. These functions are defined on general state spaces and can include marks that are either real-valued or function-valued. Unlike global mark correlation functions, which are often distorted by the existence of distinct mark behaviours, LIMA functions reliably identify all types of mark associations and variations among points. Additionally, they accurately determine the interpoint distances where individual points show significant mark associations. Through simulation studies, featuring various scenarios, and four real applications in forestry, criminology, and urban mobility, we study spatial marked point processes in $\R^2$ and o
This paper presents a framework for causal inference in the presence of censored data, where the failure time is marked by a continuous variable referred to as a mark. The mark is observed after treatment and is not meaningful when the failure time is censored. In addition, due to the continuous nature of the marks, observations at each given mark are sparse. These facts make the identification and estimation of causality a challenging task. To address these issues, we define a new mark-specific treatment effect within the potential outcomes framework and characterize its identifying conditions. We then propose a local smoothing estimator for the causal effects and establish its asymptotic properties. We further develop testing methods to evaluate whether the treatment has an effect on the failure time when controlling the values of the mark at certain points or within a defined interval, and develop a Gaussian approximation method to obtain the critical values. We evaluate our method using simulation studies as well as a real dataset from the Antibody Mediated Prevention trials.
Software watermarking allows for embedding a mark into a piece of code, such that any attempt to remove the mark will render the code useless. Provably secure watermarking schemes currently seem limited to programs computing various cryptographic operations, such as evaluating pseudorandom functions (PRFs), signing messages, or decrypting ciphertexts (the latter often going by the name ``traitor tracing''). Moreover, each of these watermarking schemes has an ad-hoc construction of its own. We observe, however, that many cryptographic objects are used as building blocks in larger protocols. We ask: just as we can compose building blocks to obtain larger protocols, can we compose watermarking schemes for the building blocks to obtain watermarking schemes for the larger protocols? We give an affirmative answer to this question, by precisely formulating a set of requirements that allow for composing watermarking schemes. We use our formulation to derive a number of applications.
Spatial phenomena in environmental and biological contexts often involve events that are unevenly distributed across space and carry attributes, whose associations/variations are space-dependent. In this paper, we introduce the class of inhomogeneous mark correlation functions, capturing mark associations/variations, while explicitly accounting for the spatial inhomogeneity of events. The proposed functions are designed to quantify how, on average, marks vary or associate with one another as a function of pairwise spatial distances. We develop nonparametric estimators and evaluate their performance through simulation studies covering a range of scenarios with mark association or variation, spanning from nonstationary point patterns without spatial interaction to those characterised by clustering tendencies. Our simulations reveal the shortcomings of traditional methods in the presence of spatial inhomogeneity, underscoring the necessity of our approach. Furthermore, the results show that our estimators accurately identify both the positivity/negativity and effective spatial range for detected mark associations/variations. The proposed inhomogeneous mark correlation functions are th
Marked power spectra provide a computationally efficient way to extract non-Gaussian information from the matter density field using the usual analysis tools developed for the power spectrum without the need for explicit calculation of higher-order correlators. In this work, we explore the optimal form of the mark function used for re-weighting the density field, to maximally constrain cosmology. We show that adding to the mark function or multiplying it by a constant leads to no additional information gain, which significantly reduces our search space for optimal marks. We quantify the information gain of this optimal function and compare it against mark functions previously proposed in the literature. We find that we can obtain errors roughly $2$ times smaller in $\sigma_8$ and roughly $4$ times smaller in $\Omega_m$ compared to using the traditional power spectrum alone, an improvement of $\sim60\%$ compared to other proposed marks when applied to the same dataset.
Prescription opioids relieve moderate-to-severe pain after surgery, but overprescription can lead to misuse and overdose. Understanding factors associated with post-surgical opioid refills is crucial for improving pain management and reducing opioid-related harms. Conventional methods often fail to account for refill size or dosage and capture patient risk dynamics. We address this gap by treating dosage as a continuously varying mark for each refill event and proposing a new class of mark-specific proportional hazards models for recurrent events. Our marginal model, developed on the gap-time scale with a dual weighting scheme, accommodates event proximity to dosage of interest while accounting for the informative number of recurrences. We establish consistency and asymptotic normality of the estimator and provide a sandwich variance estimator for robust inference. Simulations show improved finite-sample performance over competing methods. We apply the model to data from the Michigan Surgical Quality Collaborative and Michigan Automated Prescription System. Results show that high BMI, smoking, cancer, and open surgery increase hazards of high-dosage refills, while inpatient surgeri
We construct a bijection between marked bumpless pipedreams and reverse compatible pairs, which are in bijection with not-necessarily-reduced pipedreams. This directly unifies various formulas for Grothendieck polynomials in the literature. Our bijection is a generalization of a variant of the bijection of Gao and Huang in the unmarked, reduced case.
The vertex-edge marking game is played between two players on a graph, $G=(V,E)$, with one player marking vertices and the other marking edges. The players want to minimize/maximize, respectively, the number of marked edges incident to an unmarked vertex. The vertex-edge coloring number for $G$ is the maximum score achievable with perfect play. Brešar et al., [4], give an upper bound of $5$ for the vertex-edge coloring number for finite planar graphs. It is not known whether the bound is tight. In this paper, in response to questions in [4], we show that the vertex-edge coloring number for the infinite regular triangularization of the plane is 4. We also give two general techniques that allow us to calculate the vertex-edge coloring number in many related triangularizations of the plane.
Non-invasive marks, including pigmentation patterns, acquired scars, and genetic markers, are often used to identify individuals in mark-recapture experiments. If animals in a population can be identified from multiple, non-invasive marks then some individuals may be counted twice in the observed data. Analyzing the observed histories without accounting for these errors will provide incorrect inference about the population dynamics. Previous approaches to this problem include modeling data from only one mark and combining estimators obtained from each mark separately assuming that they are independent. Motivated by the analysis of data from the ECOCEAN online whale shark (Rhincodon typus) catalog, we describe a Bayesian method to analyze data from multiple, non-invasive marks that is based on the latent-multinomial model of Link et al. (2010). Further to this, we describe a simplification of the Markov chain Monte Carlo algorithm of Link et al. (2010) that leads to more efficient computation. We present results from the analysis of the ECOCEAN whale shark data and from simulation studies comparing our method with the previous approaches.
We give criteria on the existence of a so-called mark function in the context of marked metric measure spaces (mmm-spaces). If an mmm-space admits a mark function, we call it a functionally-marked metric measure space (fmm-space). This is not a closed property in the usual marked Gromov-weak topology, and thus we put particular emphasis on the question under which conditions it carries over to a limit. We obtain criteria for deterministic mmm-spaces as well as random mmm-spaces and mmm-space-valued processes. As an example, our criteria are applied to prove that the tree-valued Fleming-Viot dynamics with mutation and selection from [Depperschmidt, Greven, Pfaffelhuber, Ann. Appl. Probab. '12] admits a mark function at all times, almost surely. Thereby, we fill a gap in a former proof of this fact, which used a wrong criterion. Furthermore, the subspace of fmm-spaces, which is dense and not closed, is investigated in detail. We show that there exists a metric that induces the marked Gromov-weak topology on this subspace and is complete. Therefore, the space of fmm-spaces is a Polish space. We also construct a decomposition into closed sets which are related to the case of uniformly eq
A Bill of Materials (BoM) is a list of all components on a printed circuit board (PCB). Since BoMs are useful for hardware assurance, automatic BoM extraction (AutoBoM) is of great interest to the government and electronics industry. To achieve a high-accuracy AutoBoM process, domain knowledge of PCB text and logos must be utilized. In this study, we discuss the challenges associated with automatic PCB marking extraction and propose 1) a plan for collecting salient PCB marking data, and 2) a framework for incorporating this data for automatic PCB assurance. Given the proposed dataset plan and framework, subsequent future work, implications, and open research possibilities are detailed.
We aim to link random fields and marked point processes and therefore introduce a new class of stochastic processes which are defined on a random set in R^d. Unlike for random fields, the mark covariance function of a marked random set is in general not positive definite. This implies that in many situations the use of simple geostatistical methods appears to be questionable. Surprisingly, for a special class of processes based on Gaussian random fields, we do have positive definiteness for the corresponding mark covariance function and mark correlation function.
A k-digraph is an orientation of a multi-graph that is without loops and contains at most k edges between any pair of distinct vertices. We obtain necessary and sufficient conditions for a sequence of non-negative integers in non-decreasing order to be a sequence of numbers, called marks (k-scores), attached to vertices of a k-digraph. We characterize irreducible mark sequences in k-digraphs and uniquely realizable mark sequences in 2-digraphs.
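To make the notion of a mark concrete, the sketch below computes marks for a small k-digraph. The abstract does not state the formula, so the definition used here, $p_v = k(n-1) + d^+(v) - d^-(v)$ with $d^+$ and $d^-$ the out- and in-degree, is an assumption taken from the usual convention for k-scores; the helper name `marks` and the example arcs are hypothetical.

```python
from collections import Counter


def marks(n, k, arcs):
    """Marks of a k-digraph on vertices 0..n-1.

    arcs is a list of directed arcs (u, v). The mark formula
    p_v = k*(n-1) + outdeg(v) - indeg(v) is an assumed definition.
    """
    per_pair = Counter()
    out_deg = [0] * n
    in_deg = [0] * n
    for u, v in arcs:
        if u == v:
            raise ValueError("a k-digraph has no loops")
        per_pair[frozenset((u, v))] += 1
        out_deg[u] += 1
        in_deg[v] += 1
    if any(count > k for count in per_pair.values()):
        raise ValueError("more than k arcs between a pair of vertices")
    return [k * (n - 1) + out_deg[v] - in_deg[v] for v in range(n)]


# A 2-digraph on three vertices: two arcs 0->1, one arc 1->2, one arc 2->0.
example = marks(3, 2, [(0, 1), (0, 1), (1, 2), (2, 0)])
# Out- and in-degrees cancel when summed, so the marks add up to k*n*(n-1).
total = sum(example)
```

Under this definition every mark is a non-negative integer, because a vertex has in-degree at most $k(n-1)$, which is consistent with the abstract's sequences of non-negative integers.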
We show how a quantum walk can be used to find a marked edge or a marked complete subgraph of a complete graph. We employ a version of a quantum walk, the scattering walk, which lends itself to experimental implementation. The edges are marked by adding elements to them that impart a specific phase shift to the particle as it enters or leaves the edge. If the complete graph has N vertices and the subgraph has K vertices, the particle becomes localized on the subgraph in O(N/K) steps. This leads to a quantum search that is quadratically faster than a corresponding classical search. We show how to implement the quantum walk using a quantum circuit and a quantum oracle, which allows us to specify the resource needed for a quantitative comparison of the efficiency of classical and quantum searches -- the number of oracle calls.
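The quadratic speed-up can be checked with a small calculation. The sketch below takes one possible reading of the classical baseline: querying edges of the complete graph uniformly at random until one lying in the marked subgraph is found. That baseline, the function names, and the choice to drop constant factors in the $O(N/K)$ step count are assumptions, not details given in the abstract.

```python
def expected_classical_queries(n, k):
    """Expected uniform random edge queries before hitting a marked edge.

    The complete graph on n vertices has n*(n-1)/2 edges, and the marked
    complete subgraph on k vertices contains k*(k-1)/2 of them.
    """
    total_edges = n * (n - 1) // 2
    marked_edges = k * (k - 1) // 2
    return total_edges / marked_edges


def quantum_step_scale(n, k):
    """Scaling N/K of the walk's localization time (constants omitted)."""
    return n / k


n, k = 100, 10
classical = expected_classical_queries(n, k)  # 4950 / 45
quantum = quantum_step_scale(n, k)
```

For $N = 100$ and $K = 10$ the classical estimate is $110$ queries, close to $(N/K)^2 = 100$, while the walk's step count scales like $N/K = 10$.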
Inherent differences in behaviour of individual animal movement can introduce bias into estimates of population parameters derived from mark-recapture data. Additionally, quantifying individual heterogeneity is of considerable interest in its own right as numerous studies have shown how heterogeneity can drive population dynamics. In this paper we incorporate multiple measures of individual heterogeneity into a multi-state mark-recapture model, using MCMC estimation with a Beta-Binomial Gibbs sampler. We also present a novel Independent Metropolis-Hastings sampler which allows for efficient updating of the hyper-parameters which cannot be updated using Gibbs sampling. We tested the model using simulation studies and applied the model to mark-resight data of North Atlantic humpback whales observed in the Stellwagen Bank National Marine Sanctuary where heterogeneity is present in both sighting probability and site preference. Simulation studies show asymptotic convergence of the posterior distribution for each of the hyper-parameters to true parameter values. In application to humpback whales, individual heterogeneity is evident in sighting probability and propensity to use the mari
Classical reinforcement learning (RL) typically seeks a deterministic policy that maximizes the expected sum of a scalar reward. Yet, modern applications such as language model fine-tuning or scientific discovery demand diversity. Existing remedies such as entropy regularization or diversity bonuses often require fragile trade-offs that sacrifice performance for stochasticity or rely on heuristic metrics that can misalign policy rankings. We argue that diversity is more naturally understood as the rational response to uncertainty in the reward. When the reward function is not perfectly known--as is the case with ambiguous preferences or imperfect reward models--committing to a single action can be sub-optimal. Building on this, we propose a fundamental reformulation of the RL objective by replacing the scalar reward with a distribution over reward functions, and applying a non-linear objective over sets of actions. The result is a framework in which calibrated behavioural diversity emerges naturally, remains controllable through the reward function distribution, and is obtained without sacrificing expected reward. Focusing on the contextual bandit setting, we derive a principled gr
A Catalan word is one on the alphabet of positive integers starting with $1$ in which each subsequent letter is at most one more than its predecessor. Let $\mathcal{C}_n$ denote the set of Catalan words of length $n$. In this paper, we give combinatorial proofs of explicit formulas for the sums of several parameter values taken over all the members of $\mathcal{C}_n$. In particular, we find such proofs for the parameters tracking the number of symmetric or $\ell$-valleys, which was previously requested by Baril et al. Further, we find a combinatorial explanation of a related Catalan number identity whose proof was also requested. To carry out our arguments, we consider corresponding statistics on Dyck paths and find the cardinality of certain sets of marked Dyck paths wherein one or more of the steps is distinguished from all others.
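The definition of a Catalan word translates directly into a short enumeration. The sketch below builds $\mathcal{C}_n$ by extending each word with every letter from $1$ up to one more than its last letter, and compares the count with the Catalan number $\binom{2n}{n}/(n+1)$. The function names are illustrative and not taken from the paper.

```python
from math import comb


def catalan_words(n):
    """All Catalan words of length n >= 1, as tuples of positive integers."""
    words = [(1,)]  # every Catalan word starts with 1
    for _ in range(n - 1):
        # The next letter is any positive integer up to last letter + 1.
        words = [w + (x,) for w in words for x in range(1, w[-1] + 2)]
    return words


def catalan_number(n):
    """The n-th Catalan number, binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


words_of_length_3 = sorted(catalan_words(3))
counts_match = all(
    len(catalan_words(n)) == catalan_number(n) for n in range(1, 9)
)
```

For $n = 3$ this yields the five words 111, 112, 121, 122 and 123, matching the third Catalan number.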
Closed optical trajectories in Kerr spacetime are engineered to exhibit a marked lack of symmetry. The eccentricity manifests as a holonomy in gravitational Faraday rotation that can be made arbitrarily large by radial translation of the common location of source and receiver. All trajectories are non-equatorial and include a passage through the equatorial plane at the radial turning point, where the trajectory and pseudo-magnetic field are well-aligned. This, combined with path asymmetry, results in a large gravitational Faraday holonomy that lends itself to experimental measurement. Trajectories that start further away from the singularity pass more closely to the ergosphere, thus transiting a more distorted region of spacetime with concomitant amplification of gravitational Coriolis force.
We show that odd Khovanov homology carries an action of the super Lie algebra $\mathfrak{gl}_{1|1}$, given extra choice of markings on the link. Moreover, we show that this action arises from an action on super $\mathfrak{gl}_{2}$-foams, in the extended-TQFT framework developed by the second author and Vaz; in particular, it extends to tangles. Finally, we relate the action to torsion $\mathbb{Z}/n\mathbb{Z}$ in pretzel links $P(n,n,-n)$. In particular, this shows that all torsion can appear in odd Khovanov homology.