Found 20 results
Pursuing a continuous visual representation that offers flexible frequency modulation and fast rendering speed has recently garnered increasing attention in the fields of 3D vision and graphics. However, existing representations often rely on frequency guidance or complex neural network decoding, leading to spectrum loss or slow rendering. To address these limitations, we propose WIPES, a universal Wavelet-based vIsual PrimitivES for representing multi-dimensional visual signals. Building on the spatial-frequency localization advantages of wavelets, WIPES effectively captures both the low-frequency "forest" and the high-frequency "trees." Additionally, we develop a wavelet-based differentiable rasterizer to achieve fast visual rendering. Experimental results on various visual tasks, including 2D image representation, 5D static and 6D dynamic novel view synthesis, demonstrate that WIPES, as a visual primitive, offers higher rendering quality and faster inference than INR-based methods, and outperforms Gaussian-based representations in rendering quality.
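The spatial-frequency localization the abstract appeals to can be illustrated with a one-level Haar wavelet transform (a generic sketch, not the WIPES primitive itself): the approximation coefficients capture the low-frequency "forest" while the detail coefficients capture the high-frequency "trees."

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2)  # low-frequency "forest"
    d = (x[0::2] - x[1::2]) / np.sqrt(2)  # high-frequency "trees"
    return a, d

def haar_idwt(a, d):
    """Inverse one-level Haar DWT (perfect reconstruction)."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

# First half is smooth, second half oscillates: the smooth part lands in the
# approximation coefficients, the oscillation in the detail coefficients.
signal = np.array([4.0, 4.0, 4.0, 4.0, 1.0, 9.0, 1.0, 9.0])
a, d = haar_dwt(signal)
```

The transform is invertible, so no spectrum is lost at this step; representations built on such coefficients can modulate low- and high-frequency content separately.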
Researchers found a new way to kill harmful "zombie" cells that linger after chemotherapy and help cancers become more aggressive. These senescent cells survive by relying on a protective protein called GPX4, even while sitting on the edge of a deadly iron-triggered collapse. New drugs remove that protection, causing the cells to self-destruct.
The Toba supereruption 74,000 years ago was so massive it may have plunged Earth into years of darkness and cold, leading some scientists to believe humanity nearly went extinct. Yet archaeological evidence from Africa and Asia suggests early humans were far more resilient than once thought. Instead of disappearing, some communities adapted with ne…
This data article presents a dataset of 11,884 labeled images documenting a simulated blood extraction (phlebotomy) procedure performed on a training arm. Images were extracted from high-definition videos recorded under controlled conditions and curated to reduce redundancy using Structural Similarity Index Measure (SSIM) filtering. An automated face-anonymization step was applied to all videos prior to frame selection. Each image contains polygon annotations for five medically relevant classes: syringe, rubber band, disinfectant wipe, gloves, and training arm. The annotations were exported in a segmentation format compatible with modern object detection frameworks (e.g., YOLOv8), ensuring broad usability. This dataset is partitioned into training (70%), validation (15%), and test (15%) subsets and is designed to advance research in medical training automation and human-object interaction. It enables multiple applications, including phlebotomy tool detection, procedural step recognition, workflow analysis, conformance checking, and the development of educational systems that provide structured feedback to medical trainees. The data and accompanying label files are publicly available.
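The SSIM-based redundancy filtering described above can be sketched as follows (a simplified single-window SSIM in pure NumPy; the dataset's actual pipeline and threshold are not specified, so `global_ssim`, `filter_redundant`, and `threshold=0.95` are illustrative assumptions):

```python
import numpy as np

def global_ssim(x, y, data_range=255.0):
    """Simplified SSIM over global statistics (no sliding window)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def filter_redundant(frames, threshold=0.95):
    """Keep a frame only if its SSIM to the last kept frame is below threshold."""
    kept = [frames[0]]
    for f in frames[1:]:
        if global_ssim(kept[-1], f) < threshold:
            kept.append(f)
    return kept
```

Near-duplicate consecutive frames score close to 1.0 and are dropped, which is how video-derived datasets keep their image count manageable.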
Robots often struggle to follow free-form human instructions in real-world settings due to computational and sensing limitations. We address this gap with a lightweight, fully on-device pipeline that converts natural-language commands into reliable manipulation. Our approach has two stages: (i) the instruction-to-actions module (Instruct2Act), a compact BiLSTM with a multi-head-attention autoencoder that parses an instruction into an ordered sequence of atomic actions (e.g., reach, grasp, move, place); and (ii) the robot action network (RAN), which uses the dynamic adaptive trajectory radial network (DATRN) together with a vision-based environment analyzer (YOLOv8) to generate precise control trajectories for each sub-action. The entire system runs on a modest system with no cloud services. On our custom proprietary dataset, Instruct2Act attains 91.5% sub-action prediction accuracy while retaining a small footprint. Real-robot evaluations across four tasks (pick-place, pick-pour, wipe, and pick-give) yield an overall 90% success rate; sub-action inference completes in < 3.8 s, with end-to-end executions in 30-60 s depending on task complexity. These results demonstrate that fine-grain…
Currently, manipulation tasks for deformable objects often focus on activities like folding clothes, handling ropes, and manipulating bags. However, research on contact-rich tasks involving deformable objects remains relatively underdeveloped. When humans use cloth or sponges to wipe surfaces, they rely on both vision and tactile feedback. Yet current algorithms still face challenges with issues like occlusion, while research on tactile perception for manipulation is still evolving. Tasks such as covering surfaces with deformable objects demand not only perception but also precise robotic manipulation. To address this, we propose a method that leverages efficient and accessible simulators for task execution. Specifically, we train a reinforcement learning agent in a simulator to manipulate deformable objects for surface wiping tasks. We simplify the state representation of object surfaces using harmonic UV mapping, process contact feedback from the simulator on 2D feature maps, and use scaled grouped convolutions (SGCNN) to extract features efficiently. The agent then outputs actions in a reduced-dimensional action space to generate coverage paths. Experiments demonstrate that our…
This paper presents a typical design of the RF section of a radar receiver: the chain within a superheterodyne dual-conversion architecture. A significant challenge in this framework is the occurrence of spur signals, which degrade the dynamic range of the RF chain. To address this issue, the paper introduces an innovative approach to mitigate (or even wipe out) these undesired effects, utilizing two mutually verifying MATLAB codes. These codes have been tested with two distinct commercial mixers and can be applied to any superheterodyne configuration with various mixers. The presented method minimizes the difference between the chain's Spurious-Free Dynamic Range (SFDR) and its dynamic range. The selection of other components is also optimized to account for spurious signals, with explanations provided for these choices. Moreover, two filters of the RF chain, the second and the third, have been designed to reduce implementation costs. Various microwave software packages and full-wave analyses were employed for detailed design and analysis, with the results compared to evaluate their performance.
The rapid aging of societies is intensifying demand for autonomous care robots; however, most existing systems are task-specific and rely on handcrafted preprocessing, limiting their ability to generalize across diverse scenarios. A prevailing theory in cognitive neuroscience proposes that the human brain operates through hierarchical predictive processing, which underlies flexible cognition and behavior by integrating multimodal sensory signals. Inspired by this principle, we introduce a hierarchical multimodal recurrent neural network grounded in predictive processing under the free-energy principle, capable of directly integrating over 30,000-dimensional visuo-proprioceptive inputs without dimensionality reduction. The model was able to learn two representative caregiving tasks, rigid-body repositioning and flexible-towel wiping, without task-specific feature engineering. We demonstrate three key properties: (i) self-organization of hierarchical latent dynamics that regulate task transitions, capture variability in uncertainty, and infer occluded states; (ii) robustness to degraded vision through visuo-proprioceptive integration; and (iii) asymmetric interference in multitask learning.
The rapid advancement of neural radiance fields (NeRF) has paved the way to generate animatable human avatars from a monocular video. However, the sole usage of NeRF suffers from a lack of details, which has led to hybrid representations that combine an SMPL-based mesh with a NeRF representation. While hybrid-based models show photo-realistic human avatar generation quality, they suffer from extremely slow inference due to their deformation scheme: to stay aligned with the mesh, hybrid-based models use a deformation based on SMPL skinning weights, which incurs high computational cost for each sampled point. We observe that since most of the sampled points are located in empty space, they do not affect the generation quality but add inference latency through deformation. In light of this observation, we propose EPSilon, a hybrid-based 3D avatar generation scheme with novel efficient point sampling strategies that boost both training and inference. In EPSilon, we propose two methods to omit empty points at rendering: empty ray omission (ERO) and empty interval omission (EIO). In ERO, we wipe out rays that progress through the empty space. Then, EIO narrows do…
Recent progress in humanoid robots has unlocked agile locomotion skills, including backflipping, running, and crawling. Yet it remains challenging for a humanoid robot to perform forceful manipulation tasks such as moving objects, wiping, and pushing a cart. We propose adaptive Compliance Humanoid control through hIndsight Perturbation (CHIP), a plug-and-play module that enables controllable end-effector stiffness while preserving agile tracking of dynamic reference motions. CHIP is easy to implement and requires neither data augmentation nor additional reward tuning. We show that a generalist motion-tracking controller trained with CHIP can perform a diverse set of forceful manipulation tasks that require different end-effector compliance, such as multi-robot collaboration, wiping, box delivery, and door opening.
Autonomous robotic wiping is an important task in various industries, ranging from industrial manufacturing to sanitization in healthcare. Deep reinforcement learning (Deep RL) has emerged as a promising approach; however, it often demands repetitive reward engineering. Instead of relying on manual tuning, we first analyze the convergence of quality-critical robotic wiping, which requires both high-quality wiping and fast task completion, to show the poor convergence of the problem, and propose a new bounded reward formulation to make the problem feasible. Then, we further improve the learning process by proposing a novel visual-language model (VLM) based curriculum, which actively monitors progress and suggests hyperparameter tuning. We demonstrate that the combined method can find a desirable wiping policy on surfaces with various curvatures, frictions, and waypoints, which cannot be learned with the baseline formulation. The demo of this project can be found at: https://sites.google.com/view/highqualitywiping.
Imitation learning offers a pathway for robots to perform repetitive tasks, allowing humans to focus on more engaging and meaningful activities. However, challenges arise from the need for extensive demonstrations and the disparity between training and real-world environments. This paper focuses on contact-rich tasks like wiping with soft and deformable objects, requiring adaptive force control to handle variations in wiping surface height and the sponge's physical properties. To address these challenges, we propose a novel method that integrates real-time force-torque (FT) feedback with pre-trained object representations. This approach allows robots to dynamically adjust to previously unseen changes in surface heights and sponges' physical properties. In real-world experiments, our method achieved 96% accuracy in applying reference forces, significantly outperforming the previous method that lacked an FT feedback loop, which only achieved 4% accuracy. To evaluate the adaptability of our approach, we conducted experiments under different conditions from the training setup, involving 40 scenarios using 10 sponges with varying physical properties and 4 types of wiping surface heights.
$^{210}Po$ $\alpha$-decay driven neutron background is a concern for many rare event search experiments. It is a difficult background to control because its radiogenic component depends on the air exposure history of parts. In this study, we demonstrate that about half of the radon progeny $^{210}Po$ can be removed from copper and silicon surfaces relatively easily by wiping a copper sample with acetone-wetted tissue and a silicon detector with acetone-soaked cotton balls. For a copper sample we demonstrate that long-lived $^{210}Pb$ is removed with similar effectiveness. For copper, which was allocated the longest counting time, additional wiping was found to be largely ineffective. For silicon, the removal effectiveness has large uncertainties; additional cleaning showed a small but statistically significant effect. Capitalizing on this trivial cleaning step will allow experiments to relax their requirements on the allowable air exposure time during construction, leading to cost and time savings.
Deep neural networks used for reconstructing sparse-view CT data are typically trained by minimizing a pixel-wise mean-squared error or similar loss function over a set of training images. However, networks trained with such pixel-wise losses are prone to wipe out small, low-contrast features that are critical for screening and diagnosis. To remedy this issue, we introduce a novel training loss inspired by the model observer framework to enhance the detectability of weak signals in the reconstructions. We evaluate our approach on the reconstruction of synthetic sparse-view breast CT data, and demonstrate an improvement in signal detectability with the proposed loss.
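The failure mode motivating the proposed loss can be demonstrated numerically (an illustrative NumPy example with hypothetical intensity values, not the paper's model-observer loss): a reconstruction that wipes out a small, low-contrast feature pays almost no pixel-wise MSE penalty.

```python
import numpy as np

rng = np.random.default_rng(0)
background = rng.normal(100.0, 1.0, (256, 256))  # hypothetical 256x256 image

target = background.copy()
target[100:104, 100:104] += 2.0   # small, low-contrast "lesion" (assumed values)

recon_wiped = background.copy()   # network output that smooths the feature away

mse_wiped = np.mean((recon_wiped - target) ** 2)
# Only 16 of 65,536 pixels differ, each by 2.0, so the MSE penalty for
# erasing the feature is 16 * 2^2 / 65536, i.e. under 0.001.
```

Because the pixel-wise penalty is negligible, gradient descent has little incentive to preserve such features, which is exactly what a detectability-aware loss is meant to correct.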
Wiping behavior is a task of tracing the surface of an object while feeling the force with the palm of the hand. It requires adjusting force and posture appropriately to the various contact conditions felt by the hand. Several studies have been conducted on the wiping motion; however, these studies have dealt only with a single surface material and have considered only applying an appropriate amount of force, lacking intelligent movements that ensure the force is applied either evenly across the entire surface or to a specific area. Depending on the surface material, the hand posture and pressing force should be varied appropriately, and this is highly dependent on the definition of the task. Also, most such motions have been executed by high-rigidity robots that are easy to model, and few by robots that are low-rigidity and therefore carry a small risk of damage from excessive contact. In this study, we develop a method of motion generation based on learned prediction of contact force during the wiping motion of a low-rigidity robot. We show that MyCobot, which is made of low-rigidity resin, can appropriately perform wiping.
Electromagnetic wiping systems make it possible to pre-meter the coating thickness of the liquid metal on a moving substrate. These systems have the potential to provide a more uniform coating and significantly higher production rates compared to pneumatic wiping, but they require substantially larger amounts of energy. This work presents a multi-objective optimization accounting for (1) maximal wiping efficiency, (2) maximal smoothness of the wiping meniscus, and (3) minimal Joule heating. We present the Pareto front, identifying the best wiping conditions given a set of weights for the three competing objectives. The optimization was based on a 1D steady-state integral model, whose prediction scales according to the Hartmann number (Ha). The optimization uses a multi-gradient approach, with gradients computed with a combination of finite differences and variational methods. The results show that the wiping efficiency depends solely on Ha and not the magnetic field distribution. Moreover, we show that the liquid thickness becomes insensitive to the intensity of the magnetic field above a certain threshold and that the current distribution (hence the Joule heating) is mildly affected by the ma…
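Weighing competing objectives against each other can be sketched with a simple weighted-sum scalarization (illustrative only; the paper itself uses a multi-gradient method on a 1D steady-state integral model, and the objective values and weights below are hypothetical):

```python
import numpy as np

def scalarize(objectives, weights):
    """Weighted-sum scalarization of competing objectives (all minimized)."""
    return float(np.dot(weights, objectives))

# Hypothetical candidate designs, scored as
# (efficiency loss, meniscus roughness, Joule heating), all to be minimized.
candidates = [(0.2, 0.5, 0.9), (0.4, 0.3, 0.5), (0.7, 0.2, 0.2)]
weights = np.array([0.5, 0.3, 0.2])  # assumed preference weights

best = min(candidates, key=lambda c: scalarize(c, weights))
```

Sweeping the weight vector and collecting the minimizers traces out candidate points on the Pareto front, which is the role the weights play in the abstract above.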
Magnetohydrodynamic (MHD) waves are often invoked to interpret quasi-periodic pulsations (QPPs) in solar flares. We study the response of a straight flare loop to a kink-like velocity perturbation using three-dimensional MHD simulations and forward model the microwave emissions using the fast gyrosynchrotron code. Kink motions with two periodicities are simultaneously generated, with the long-period component P_L = 57 s attributed to the radial fundamental kink mode and the short-period component P_S = 5.8 s to the first leaky kink mode. Forward modeling results show that the two-periodic oscillations are detectable in the microwave intensities for some lines of sight. Increasing the beam size to (1")^2 does not wipe out the microwave oscillations. We propose that the first leaky kink mode is a promising candidate mechanism to account for short-period QPPs. Radio telescopes with high spatial resolution can help distinguish this new mechanism from such customary interpretations as sausage modes.
A patient firm interacts with a sequence of consumers. The firm is either an honest type who supplies high quality and never erases its records, or an opportunistic type who chooses what quality to supply and may erase its records at a low cost. We show that in every equilibrium, the firm has an incentive to build a reputation for supplying high quality until its continuation value exceeds its commitment payoff, but its ex ante payoff must be close to its minmax value when it has a sufficiently long lifespan. Therefore, even a small fraction of opportunistic types can wipe out the firm's returns from building reputations. Even if the honest type can commit to reveal information about its history according to any disclosure policy, the opportunistic type's payoff cannot exceed its equilibrium payoff when the consumers receive no information.
Some actions must be executed in different ways depending on the context. For example, wiping away marker requires vigorous force while wiping away almonds requires more gentle force. In this paper we provide a model where an agent learns which manner of action execution to use in which context, drawing on evidence from trial and error and verbal corrections when it makes a mistake (e.g., ``no, gently''). The learner starts out with a domain model that lacks the concepts denoted by the words in the teacher's feedback: both the words describing the context (e.g., marker) and the adverbs like ``gently''. We show that through the semantics of coherence, our agent can perform the symbol grounding that's necessary for exploiting the teacher's feedback so as to solve its domain-level planning problem: to perform its actions in the current context in the right way.
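A minimal sketch of such a learning loop (a hypothetical class with a naive string-stem grounding rule, not the paper's coherence-based semantics): the agent starts with a default manner and rebinds a context to a manner whenever the teacher's adverb corrects it.

```python
class MannerLearner:
    """Toy learner: maps context words to manners of execution."""

    def __init__(self, manners=("vigorous", "gentle")):
        self.manners = manners
        self.belief = {}  # context word -> learned manner

    def choose(self, context):
        # Fall back to a default manner for unseen contexts.
        return self.belief.get(context, self.manners[0])

    def correct(self, context, spoken_adverb):
        # Naively ground the adverb ("gently") to a manner by stem matching;
        # the paper instead grounds such words via the semantics of coherence.
        for m in self.manners:
            if spoken_adverb.startswith(m[:5]):
                self.belief[context] = m

learner = MannerLearner()
learner.correct("almonds", "gently")  # teacher says "no, gently"
```

After the correction, the agent wipes almonds gently while still defaulting to vigorous wiping for marker, mirroring the example in the abstract.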