Longitudinal electronic health record (EHR) data offer opportunities to study biomarker trajectories; however, association estimates (the primary inferential target) from standard models designed for regular observation times may be biased by a two-stage hierarchical missingness mechanism. The first stage is the visiting process (informative presence), where encounters occur at irregular times driven by patient health status; the second is the observation process (informative observation), where biomarkers are selectively measured during visits. To address these mechanisms, we propose a unified semiparametric joint modeling framework that simultaneously characterizes the visiting, biomarker observation, and longitudinal outcome processes. Central to this framework is a shared subject-specific Gaussian latent variable that captures unmeasured frailty and induces dependence across all components. We develop a three-stage estimation procedure and establish the consistency and asymptotic normality of our estimators. We also introduce a sequential procedure that imputes missing biomarkers prior to adjusting for irregular visiting and examine its performance. Simulation results demonstrate
Location-based services (LBS) have accumulated extensive human mobility data on diverse behaviors through check-in sequences. These sequences offer valuable insights into users' intentions and preferences. Yet, existing models analyzing check-in sequences fail to consider the semantics contained in these sequences, which closely reflect human visiting intentions and travel preferences, leading to an incomplete comprehension. Drawing inspiration from the exceptional semantic understanding and contextual information processing capabilities of large language models (LLMs) across various domains, we present Mobility-LLM, a novel framework that leverages LLMs to analyze check-in sequences for multiple tasks. Since LLMs cannot directly interpret check-ins, we reprogram these sequences to help LLMs comprehensively understand the semantics of human visiting intentions and travel preferences. Specifically, we introduce a visiting intention memory network (VIMN) to capture the visiting intentions at each record, along with a shared pool of human travel preference prompts (HTPP) to guide the LLM in understanding users' travel preferences. These components enhance the model's ability to extrac
We study the accurate and efficient computation of the expected number of times each state is visited in discrete- and continuous-time Markov chains. To obtain sound accuracy guarantees efficiently, we lift interval iteration and topological approaches known from the computation of reachability probabilities and expected rewards. We further study applications of expected visiting times, including the sound computation of the stationary distribution and expected rewards conditioned on reaching multiple goal states. The implementation of our methods in the probabilistic model checker Storm scales to large systems with millions of states. Our experiments on the quantitative verification benchmark set show that the computation of stationary distributions via expected visiting times consistently outperforms existing approaches, sometimes by several orders of magnitude.
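To make the quantity concrete, the following is a minimal sketch of computing expected visiting times for the transient states of a small discrete-time Markov chain. It uses a plain fixed-point iteration on $x = \mu + xQ$, which is an assumption chosen for illustration: it is not the interval-iteration or topological method of the paper and gives no sound error bound. The names `expected_visits`, `Q`, and `init` are hypothetical.

```python
def expected_visits(Q, init, tol=1e-12, max_iter=100_000):
    """Approximate expected visit counts of transient states.

    Q[t][s] is the probability of moving from transient state t to
    transient state s (rows may sum to less than 1; the rest is the
    probability of leaving the transient part). init[s] is the
    probability of starting in s. Iterates x <- init + x Q.
    """
    n = len(Q)
    x = list(init)
    for _ in range(max_iter):
        new_x = [init[s] + sum(x[t] * Q[t][s] for t in range(n))
                 for s in range(n)]
        # Caveat: a small step size does not by itself bound the true
        # error; that gap is what sound methods are designed to close.
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    raise RuntimeError("fixed-point iteration did not converge")


# Hypothetical two-state example: state 0 loops with probability 0.5 and
# moves to state 1 with probability 0.25; state 1 loops with 0.5.
Q = [[0.5, 0.25],
     [0.0, 0.5]]
visits = expected_visits(Q, [1.0, 0.0])
print(visits)  # approximately [2.0, 1.0]
```

Solving the two equations by hand gives $x_0 = 1 + 0.5\,x_0 = 2$ and $x_1 = 0.25\,x_0 + 0.5\,x_1 = 1$, which the iteration approaches from below.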
We extend the graph convolutional network method for deep learning on graph data to higher order in terms of neighboring nodes. In order to construct representations for a node in a graph, in addition to the features of the node and its immediate neighboring nodes, we also include more distant nodes in the calculations. In experimenting with a number of publicly available citation graph datasets, we show that this higher order neighbor visiting pays off by outperforming the original model especially when we have a limited number of available labeled data points for the training of the model.
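As a rough illustration of what visiting higher-order neighbors can mean, the sketch below builds a node representation from the node's own features, the mean of its 1-hop neighbors, and the mean of its 2-hop neighbors. This is an assumption made for illustration only: the abstract does not specify the propagation rule, and a real model would add learned weights, normalization, and nonlinearities. All function names are hypothetical.

```python
def k_hop_neighbors(adj, node, k):
    """Return the set of nodes at shortest-path distance exactly k."""
    frontier, seen = {node}, {node}
    for _ in range(k):
        nxt = {v for u in frontier for v in adj[u] if v not in seen}
        seen |= nxt
        frontier = nxt
    return frontier


def mean_vector(vectors, dim):
    """Component-wise mean; a zero vector if there are no inputs."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def higher_order_representation(adj, feats, node, max_hops=2):
    """Concatenate own features with per-hop mean neighbor features."""
    dim = len(feats[node])
    rep = list(feats[node])
    for k in range(1, max_hops + 1):
        hop = sorted(k_hop_neighbors(adj, node, k))
        rep += mean_vector([feats[u] for u in hop], dim)
    return rep


# Hypothetical path graph 0 - 1 - 2 with one feature per node.
adj = {0: [1], 1: [0, 2], 2: [1]}
feats = {0: [1.0], 1: [2.0], 2: [4.0]}
print(higher_order_representation(adj, feats, 0))  # [1.0, 2.0, 4.0]
print(higher_order_representation(adj, feats, 1))  # [2.0, 2.5, 0.0]
```

Node 0 sees node 2 only through the 2-hop term, which is the extra information a first-order model would miss in a single layer.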
Motivated by an optimal visiting problem, we study a switching mean-field game on a network, where each agent has at its disposal both a decisional and a switching time variable, governing, respectively, the instant at which to decide and the instant at which to perform the switch. Every switch between the nodes of the network represents a switch from $0$ to $1$ of one component of the string $p = (p_1,\ldots, p_n)$ which, in the optimal visiting interpretation, records which targets have been visited, the targets being labeled by $i=1,\ldots, n$. The goal is to reach the final string $(1, \ldots, 1)$ in the final time $T$, minimizing a switching cost that also depends on the congestion on the nodes. We prove the existence of a suitably defined approximate $\varepsilon$-mean-field equilibrium and then address the passage to the limit as $\varepsilon$ goes to $0$.
Nowadays, cyberattacks are increasing at an unprecedented rate. Phishing is a social engineering attack that has a massive global impact, destroying the financial and economic value of corporations, government sectors, and individuals. In phishing, attackers steal users' personal information such as usernames, passwords, debit card information, and so on. To detect zero-hour attacks and protect end-users from them, various anti-phishing techniques have been developed, but end-users have to visit the websites to know whether they are safe, which may lead to their systems being infected. In this paper, we propose a method by which end-users can check the genuineness of sites without visiting them. The proposed method collects legitimate and phishing URLs and extracts features from them. The extracted features are given as input to six different classifiers for training and constructing the model. The classifiers used are Naive Bayes, Logistic Regression, Random Forest, CatBoost, XGBoost, and Multilayer Perceptron. The method is tested by developing it into a browser extension so that end-users can use it when browsing. In the browser extension when the user takes the cursor
Let $E=\{e_1,\ldots,e_n\}$ be a set of $C$-oriented disjoint segments in the plane, where $C$ is a given finite set of orientations that spans the plane, and let $s$ and $t$ be two points. We seek a minimum-link $C$-oriented tour of $E$, that is, a polygonal path $\pi$ from $s$ to $t$ that visits the segments of $E$ in order, such that the orientations of its edges are in $C$ and their number is minimum. We present an algorithm for computing such a tour in $O(|C|^2 \cdot n^2)$ time. This problem already captures most of the difficulties occurring in the study of the more general problem, in which $E$ is a set of not-necessarily-disjoint $C$-oriented polygons.
The optimal visiting problem is the optimization of a trajectory that has to touch or pass as close as possible to a collection of target points. The problem does not satisfy the dynamic programming principle, and it needs a specific formulation to keep track of the visited target points. In this paper, we introduce a hybrid approach by adding a discontinuous part of the trajectory that switches between a group of discrete states related to the targets. Then, we show the well-posedness of the related Hamilton-Jacobi problem by reformulating the optimal visiting problem as a collection of time-dependent optimal stopping problems.
In an optimal visiting problem, we want to control a trajectory that has to pass as close as possible to a collection of target points or regions. We introduce a hybrid control-based approach for the classic problem where the trajectory can switch between a group of discrete states related to the targets of the problem. The model is subsequently adapted to a mean-field framework to study viability and crowd fluxes to model a multitude of indistinguishable players.
This article considers two variants of a shortest path problem for a car-like robot visiting a set of waypoints. The sequence of waypoints to be visited is specified in the first variant, while the robot is allowed to visit the waypoints in any sequence in the second variant. Field-of-view constraints are also imposed when the robot arrives at a waypoint, i.e., the orientation of the robot at any waypoint is restricted to belong to a given interval of angles at the waypoint. The shortest path problem is first solved for two waypoints with the field-of-view constraints using Pontryagin's minimum principle. Using the results for the two-point problem, tight lower and upper bounds on the length of the shortest path are developed for visiting $n$ points by relaxing the requirement that the arrival angle must be equal to the departure angle of the robot at each waypoint. Theoretical bounds are also provided on the length of the feasible solutions obtained by the proposed algorithm. Simulation results verify the performance of the bounds for instances with 20 waypoints.
In this paper, we study the probability of visiting a distant point $a\in \mathbb{Z}^4$ by critical branching random walk starting from the origin. We prove that this probability is bounded by $1/(|a|^2\log |a|)$ up to a constant.
The class of random walks in one dimension, returning to the origin, restricted by the requirement that any site visited (different from the origin) is visited an even number of times, is analyzed in the present note. We call this class the even-visiting random walks and provide a closed expression to evaluate them.
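For small lengths, the class can be checked by brute force. The sketch below enumerates all $\pm 1$ walks of a given length and counts those that end at the origin and visit every non-origin site an even number of times. It assumes that intermediate returns to the origin are allowed and that a visit is counted on each arrival at a site; both are assumptions, since the abstract does not fix these conventions. The function name is hypothetical, and this enumeration is not the closed expression of the note.

```python
from collections import Counter
from itertools import product


def count_even_visiting_walks(length):
    """Count +/-1 walks of `length` steps that end at 0 and visit each
    site other than 0 an even number of times (exhaustive search)."""
    count = 0
    for steps in product((1, -1), repeat=length):
        pos = 0
        visits = Counter()
        for step in steps:
            pos += step
            visits[pos] += 1  # one visit per arrival at a site
        if pos != 0:
            continue  # the walk must return to the origin at the end
        if all(c % 2 == 0 for site, c in visits.items() if site != 0):
            count += 1
    return count


print(count_even_visiting_walks(2))  # 0: site +1 or -1 is visited once
print(count_even_visiting_walks(4))  # 2: the walks +-+- and -+-+
```

The enumeration costs $2^{\text{length}}$ walks, so it is only useful as a sanity check against a closed formula for short walks.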
We extend the theory of discrete capacity to critical branching random walk. We introduce branching capacity for any finite subset of $\mathbb{Z}^d$, $d\geq 5$. Analogous to the regular discrete capacity, branching capacity is closely related to the asymptotics of the probability of visiting a fixed finite set by a critical branching random walk starting from a distant point and the conditional distribution of the hitting point.
This paper addresses the Counting Long Aggregated Visits problem, which is defined as follows. We are given $n$ users and $m$ regions, where each user spends some time visiting some regions. For a parameter $k$ and a query consisting of a subset of $r$ regions, the task is to count the number of distinct users whose aggregate time spent visiting the query regions is at least $k$. This problem is motivated by queries arising in the analysis of large-scale mobility datasets. We present several exact and approximate data structures for supporting counting long aggregated visits, as well as conditional and unconditional lower bounds. First, we describe an exact data structure that exhibits a space-time tradeoff, as well as efficient approximate solutions based on sampling and sketching techniques. We then study the problem in geometric settings where regions are points in $\mathbb{R}^d$ and queries are hyperrectangles, and derive exact data structures that achieve improved performance in these structured spaces.
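The problem statement can be pinned down with a naive baseline that scans every user for each query. The sketch below is only a reference implementation of the definition, not one of the data structures from the paper; the input layout (a dictionary from user to per-region time) and all names are assumptions made for illustration.

```python
def count_long_aggregated_visits(time_spent, query_regions, k):
    """Count users whose total time in the query regions is at least k.

    time_spent maps each user to a dict {region: time spent there}.
    Runs in time proportional to the total number of (user, region)
    entries, with no preprocessing.
    """
    regions = set(query_regions)
    return sum(
        1
        for visits in time_spent.values()
        if sum(t for region, t in visits.items() if region in regions) >= k
    )


# Hypothetical data: three users, three regions.
time_spent = {
    "alice": {"A": 3, "B": 2},
    "bob": {"A": 1, "C": 5},
    "carol": {"B": 4},
}
print(count_long_aggregated_visits(time_spent, ["A", "B"], 4))  # 2
print(count_long_aggregated_visits(time_spent, ["C"], 5))       # 1
```

For the query on regions A and B with $k = 4$, alice (total 5) and carol (total 4) qualify while bob (total 1) does not. Any exact data structure for the problem should agree with this scan on every query.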
Maximum entropy reinforcement learning motivates agents to explore states and actions to maximize the entropy of some distribution, typically by providing additional intrinsic rewards proportional to that entropy function. In this paper, we study intrinsic rewards proportional to the entropy of the discounted distribution of state-action features visited during future time steps. This approach is motivated by two results. First, we show that the expected sum of these intrinsic rewards is a lower bound on the entropy of the discounted distribution of state-action features visited in trajectories starting from the initial states, which we relate to an alternative maximum entropy objective. Second, we show that the distribution used in the intrinsic reward definition is the fixed point of a contraction operator and can therefore be estimated off-policy. Experiments highlight that the new objective leads to improved visitation of features within individual trajectories, in exchange for slightly reduced visitation of features in expectation over different trajectories, as suggested by the lower bound. It also leads to improved convergence speed for learning exploration-only agents. Cont
Understanding where people go after visiting one business is crucial for urban planning, retail analytics, and location-based services. However, predicting these co-visitation patterns across millions of venues remains challenging due to extreme data sparsity and the complex interplay between spatial proximity and business relationships. Traditional approaches using only geographic distance fail to capture why coffee shops attract different customer flows than fine dining restaurants, even when co-located. We introduce NAICS-aware GraphSAGE, a novel graph neural network that integrates business taxonomy knowledge through learnable embeddings to predict population-scale co-visitation patterns. Our key insight is that business semantics, captured through detailed industry codes, provide crucial signals that pure spatial models cannot explain. The approach scales to massive datasets (4.2 billion potential venue pairs) through efficient state-wise decomposition while combining spatial, temporal, and socioeconomic features in an end-to-end framework. Evaluated on our POI-Graph dataset comprising 94.9 million co-visitation records across 92,486 brands and 48 US states, our method achieve
Pinwheel Scheduling is a fundamental scheduling problem, in which each task $i$ is associated with a positive integer $d_i$, and the objective is to schedule one task per time slot, ensuring each task perpetually appears at least once in every $d_i$ time slots. Although conjectured to be PSPACE-complete, it remains open whether Pinwheel Scheduling is NP-hard (unless a compact input encoding is used) or even contained in NP. We introduce $k$-Visits, a finite version of Pinwheel Scheduling, where, given $n$ deadlines, the goal is to schedule each task exactly $k$ times. While we observe that the 1-Visit problem is trivial, we prove that 2-Visits is strongly NP-complete through a surprising reduction from Numerical 3-Dimensional Matching (N3DM). As intermediate steps in the reduction, we define NP-complete variants of N3DM which may be of independent interest. We further extend our strong NP-hardness result to a generalization of $k$-Visits ($k\geq 2$) in which the deadline of each task may vary throughout the schedule, as well as to a similar generalization of Pinwheel Scheduling, thus making progress towards settling the complexity of Pinwheel Scheduling. Additionally, we prove that 2-Visits c
When customers must visit a seller to learn the valuation of its product, sellers potentially benefit from charging a lower price on the first visit and a higher price when a buyer returns. Armstrong and Zhou (2016) show that such price discrimination can arise in equilibrium when buyers learn a seller's pricing policy only upon visiting. We depart from this assumption by supposing that sellers commit to observable pricing policies that guide consumer search and buyers can choose whom to visit first. We show that no seller engages in price discrimination in equilibrium.
Alzheimer's disease (AD) is a neurodegenerative disorder with no known cure that affects tens of millions of people worldwide. Early detection of AD is critical for timely intervention to halt or slow the progression of the disease. In this study, we propose a Transformer model for predicting the stage of AD progression at a subject's next clinical visit using features from a sequence of visits extracted from the subject's visit history. We also rigorously compare our model to recurrent neural networks (RNNs) such as long short-term memory (LSTM), gated recurrent unit (GRU), and minimalRNN and assess their performances based on factors such as the length of prior visits and data imbalance. We test the importance of different feature categories and visit history, as well as compare the model to a newer Transformer-based model optimized for time series. Our model demonstrates strong predictive performance despite missing visits and missing features in available visits, particularly in identifying converter subjects -- individuals transitioning to more severe disease stages -- an area that has posed significant challenges in longitudinal prediction. The results highlight the model's p
Electronic health records (EHRs) provide an efficient approach to generating rich longitudinal datasets. However, since patients visit as needed, the assessment times are typically irregular and may be related to the patient's health. Failing to account for this informative assessment process could result in biased estimates of the disease course. In this paper, we show how estimation of the disease trajectory can be enhanced by leveraging an underutilized piece of information that is often in the patient's EHR: physician-recommended intervals between visits. Specifically, we demonstrate how recommended intervals can be used in characterizing the assessment process, and in investigating the sensitivity of the results to assessment not at random (ANAR). We illustrate our proposed approach in a clinic-based cohort study of juvenile dermatomyositis (JDM). In this study, we found that the recommended intervals explained 78% of the variability in the assessment times. Under a specific case of ANAR where we assumed that a worsening in disease led to patients visiting earlier than recommended, the estimated population average disease activity trajectory was shifted downward relative to th