Robotic fruit harvesting often fails to reliably detect whether a fruit has been successfully picked, limiting efficiency and increasing crop damage. This problem is difficult due to compliant fruit and grippers, variable stem attachment, and occlusions in orchard environments. Prior work has explored vision-based perception and multi-sensor learning approaches for pick state estimation. However, minimal sensor sets and phase-dependent sensing strategies for accurate pick and slip detection remain largely unexplored. In this work, we design and evaluate a multimodal sensing suite integrated into a compliant suction-based apple gripper. Our approach is unique because it identifies which sensors are most informative at different phases of the pick, enabling predictive detection of failures before they occur. The contributions of this paper are a phase-dependent evaluation of multimodal sensors and the identification of minimal sensor sets for reliable pick state classification. Experiments in a real apple orchard show that Random Forest and Multilayer Perceptron classifiers detect successful picks and impending failures with over 90% accuracy, and Random Forest predicts pick/slip eve
Counterfactual explanations are widely used to communicate how inputs must change for a model to alter its prediction. For a single instance, many valid counterfactuals can exist, which leaves open the possibility for an explanation provider to cherry-pick explanations that better suit a narrative of their choice, highlighting favourable behaviour and withholding examples that reveal problematic behaviour. We formally define cherry-picking for counterfactual explanations in terms of an admissible explanation space, specified by the generation procedure, and a utility function. We then study to what extent an external auditor can detect such manipulation. Considering three levels of access to the explanation process: full procedural access, partial procedural access, and explanation-only access, we show that detection is extremely limited in practice. Even with full procedural access, cherry-picked explanations can remain difficult to distinguish from non cherry-picked explanations, because the multiplicity of valid counterfactuals and flexibility in the explanation specification provide sufficient degrees of freedom to mask deliberate selection. Empirically, we demonstrate that thi
The food packaging industry goes through changes in food items and their weights quite rapidly. These items range from easy-to-pick, single-piece food items to flexible, long and cluttered ones. We propose a replaceable bit-based gripper system to tackle the challenge of weight-based handling of cluttered food items. The gripper features specialized food attachments(bits) that enhance its grasping capabilities, and a belt replacement system allows switching between different food items during packaging operations. It offers a wide range of control options, enabling it to grasp and drop specific weights of granular, cluttered, and entangled foods. We specifically designed bits for two flexible food items that differ in shape: ikura(salmon roe) and spaghetti. They represent the challenging categories of sticky, granular food and long, sticky, cluttered food, respectively. The gripper successfully picked up both spaghetti and ikura and demonstrated weight-specific dropping of these items with an accuracy over 80% and 95% respectively. The gripper system also exhibited quick switching between different bits, leading to the handling of a large range of food items.
Residual moveout (RMO) provides critical information for travel time tomography. The current industry-standard method for fitting RMO involves scanning high-order polynomial equations. However, this analytical approach does not accurately capture local saltation, leading to low iteration efficiency in tomographic inversion. Supervised learning-based image segmentation methods for picking can effectively capture local variations; however, they encounter challenges such as a scarcity of reliable training samples and the high complexity of post-processing. To address these issues, this study proposes a deep learning-based cascade picking method. It distinguishes accurate and robust RMOs using a segmentation network and a post-processing technique based on trend regression. Additionally, a data synthesis method is introduced, enabling the segmentation network to be trained on synthetic datasets for effective picking in field data. Furthermore, a set of metrics is proposed to quantify the quality of automatically picked RMOs. Experimental results based on both model and real data demonstrate that, compared to semblance-based methods, our approach achieves greater picking density and acc
First-break picking is an essential step in seismic data processing. First arrivals should be picked by an expert. This is a time-consuming procedure and subjective to a certain degree, leading to different results for different operators. In this study, we used a U-Net architecture with residual blocks to perform automatic first-break picking based on deep learning. Focusing on the effects of weight initialization on this process, we conduct this research by using the weights of a pretrained network that is used for object detection on the ImageNet dataset. The efficiency of the proposed method is tested on two real datasets. For both datasets, we pick manually the first breaks for less than 10% of the seismic shots. The pretrained network is fine-tuned on the picked shots and the rest of the shots are automatically picked by the neural network. It is shown that this strategy allows to reduce the size of the training set, requiring fine tuning with only a few picked shots per survey. Using random weights and more training epochs can lead to a lower training loss, but such a strategy leads to overfitting as the test error is higher than the one of the pretrained network. We also as
The aim of this article is to give an elementary proof of the fact that the Schwarz-Pick Lemma follows from the Ahlfors-Schwarz-Pick Lemma.
This work demonstrates how autonomously learning aspects of robotic operation from sparsely-labeled, real-world data of deployed, engineered solutions at industrial scale can provide with solutions that achieve improved performance. Specifically, it focuses on multi-suction robot picking and performs a comprehensive study on the application of multi-modal visual encoders for predicting the success of candidate robotic picks. Picking diverse items from unstructured piles is an important and challenging task for robot manipulation in real-world settings, such as warehouses. Methods for picking from clutter must work for an open set of items while simultaneously meeting latency constraints to achieve high throughput. The demonstrated approach utilizes multiple input modalities, such as RGB, depth and semantic segmentation, to estimate the quality of candidate multi-suction picks. The strategy is trained from real-world item picking data, with a combination of multimodal pretrain and finetune. The manuscript provides comprehensive experimental evaluation performed over a large item-picking dataset, an item-picking dataset targeted to include partial occlusions, and a package-picking da
Strawberries naturally grow in clusters, interwoven with leaves, stems, and other fruits, which frequently leads to occlusion. This inherent growth habit presents a significant challenge for robotic picking, as traditional percept-plan-control systems struggle to reach fruits amid the clutter. Effectively picking an occluded strawberry demands dexterous manipulation to carefully bypass or gently move the surrounding soft objects and precisely access the ideal picking point located at the stem just above the calyx. To address this challenge, we introduce a strawberry-picking robotic system that learns from human demonstrations. Our system features a 4-DoF SCARA arm paired with a human teleoperation interface for efficient data collection and leverages an End Pose Assisted Action Chunking Transformer (ACT) to develop a fine-grained visuomotor picking policy. Experiments under various occlusion scenarios demonstrate that our modified approach significantly outperforms the direct implementation of ACT, underscoring its potential for practical application in occluded strawberry picking.
When objects are packed in a cluster, physical interactions are unavoidable. Such interactions emerge because of the objects geometric features; some of these features promote entanglement, while others create repulsion. When entanglement occurs, the cluster exhibits a global, complex behaviour, which arises from the stochastic interactions between objects. We hereby refer to such a cluster as an entangled granular metamaterial. We investigate the geometrical features of the objects which make up the cluster, henceforth referred to as grains, that maximise entanglement. We hypothesise that a cluster composed from grains with high propensity to tangle, will also show propensity to interact with a second cluster of tangled objects. To demonstrate this, we use the entangled granular metamaterials to perform complex robotic picking tasks, where conventional grippers struggle. We employ an electromagnet to attract the metamaterial (ferromagnetic) and drop it onto a second cluster of objects (targets, non-ferromagnetic). When the electromagnet is re-activated, the entanglement ensures that both the metamaterial and the targets are picked, with varying degrees of physical engagement that
New selected values of odd random simplex volumetric moments (moments of the volume of a random simplex picked from a given body) are derived in an exact form in various bodies in dimensions three, four, five, and six. In three dimensions, the well-known Efron's formula was used by Buchta & Reitzner and Zinani to deduce the mean volume of a random tetrahedron in a tetrahedron and a cube. However, for higher moments and/or in higher dimensions, the method fails. As it turned out, the same problem is also solvable using the Blashke-Petkantschin formula in Cartesian parametrisation in the form of the Canonical Section Integral (Base-height splitting). In our presentation, we show how to derive the older results mentioned above using our base-height splitting method and also touch on the essential steps of how the method translates to higher dimensions and for higher moments.
First-break picking is a pivotal procedure in processing microseismic data for geophysics and resource exploration. Recent advancements in deep learning have catalyzed the evolution of automated methods for identifying first-break. Nevertheless, the complexity of seismic data acquisition and the requirement for detailed, expert-driven labeling often result in outliers and potential mislabeling within manually labeled datasets. These issues can negatively affect the training of neural networks, necessitating algorithms that handle outliers or mislabeled data effectively. We introduce the Simultaneous Picking and Refinement (SPR) algorithm, designed to handle datasets plagued by outlier samples or even noisy labels. Unlike conventional approaches that regard manual picks as ground truth, our method treats the true first-break as a latent variable within a probabilistic model that includes a first-break labeling prior. SPR aims to uncover this variable, enabling dynamic adjustments and improved accuracy across the dataset. This strategy mitigates the impact of outliers or inaccuracies in manual labels. Intra-site picking experiments and cross-site generalization experiments on publicl
Picking sequences are well-established methods for allocating indivisible goods. Among the various picking sequences, recursively balanced picking sequences -- whereby each agent picks one good in every round -- are notable for guaranteeing allocations that satisfy envy-freeness up to one good. In this paper, we compare the fairness of different recursively balanced picking sequences using two key measures. Firstly, we demonstrate that all such sequences have the same price in terms of egalitarian welfare relative to other picking sequences. Secondly, we characterize the approximate maximin share (MMS) guarantees of these sequences. In particular, we show that compensating the agent who picks last in the first round by letting her pick first in every subsequent round yields the best MMS guarantee.
Robotic pick and place stands at the heart of autonomous manipulation. When conducted in cluttered or complex environments robots must jointly reason about the selected grasp and desired placement locations to ensure success. While several works have examined this joint pick-and-place problem, none have fully leveraged recent learning-based approaches for multi-fingered grasp planning. We present a modular algorithm for joint pick and place planning that can make use of state of the art grasp classifiers for planning multi-fingered grasps for novel objects from partial view point clouds. We demonstrate our joint pick and place formulation with several costs associated with different placement tasks. Experiments on pick and place tasks with cluttered scenes using a physical robot show that our joint inference method is more successful than a sequential pick then place approach, while also achieving better placement configurations.
This paper presents a robotic in-hand manipulation technique that can be applied to pick an object too large to grasp in a prehensile manner, by taking advantage of its contact interactions with a curved, passive end-effector, and two flat support surfaces. First, the object is tilted up while being held between the end-effector and the supports. Then, the end-effector is tucked into the gap underneath the object, which is formed by tilting, in order to obtain a grasp against gravity. In this paper, we first examine the mechanics of tilting to understand the different ways in which the object can be initially tilted. We then present a strategy to tilt up the object in a secure manner. Finally, we demonstrate successful picking of objects of various size and geometry using our technique through a set of experiments performed with a custom-made robotic device and a conventional robot arm. Our experiment results show that object picking can be performed reliably with our method using simple hardware and control, and when possible, with appropriate fixture design.
Picking diverse objects in the real world is a fundamental robotics skill. However, many objects in such settings are bulky, heavy, or irregularly shaped, making them ungraspable by conventional end effectors like suction grippers and parallel jaw grippers (PJGs). In this paper, we expand the range of pickable items without hardware modifications using bimanual nonprehensile manipulation. We focus on a grocery shopping scenario, where a bimanual mobile manipulator equipped with a suction gripper and a PJG is tasked with retrieving ungraspable items from tightly packed grocery shelves. From visual observations, our method first identifies optimal grasp points based on force closure and friction constraints. If the grasp points are occluded, a series of nonprehensile nudging motions are performed to clear the obstruction. A bimanual grasp utilizing contacts on the side of the end effectors is then executed to grasp the target item. In our replica grocery store, we achieved a 90% success rate over 102 trials in uncluttered scenes, and a 67% success rate over 45 trials in cluttered scenes. We also deployed our system to a real-world grocery store and successfully picked previously unse
Efficiency and reliability are critical in robotic bin-picking as they directly impact the productivity of automated industrial processes. However, traditional approaches, demanding static objects and fixed collisions, lead to deployment limitations, operational inefficiencies, and process unreliability. This paper introduces a Dynamic Bin-Picking Framework (DBPF) that challenges traditional static assumptions. The DBPF endows the robot with the reactivity to pick multiple moving arbitrary objects while avoiding dynamic obstacles, such as the moving bin. Combined with scene-level pose generation, the proposed pose selection metric leverages the Tendency-Aware Manipulability Network optimizing suction pose determination. Heuristic task-specific designs like velocity-matching, dynamic obstacle avoidance, and the resight policy, enhance the picking success rate and reliability. Empirical experiments demonstrate the importance of these components. Our method achieves an average 84% success rate, surpassing the 60% of the most comparable baseline, crucially, with zero collisions. Further evaluations under diverse dynamic scenarios showcase DBPF's robust performance in dynamic bin-pickin
Seismic phase picking, which aims to determine the arrival time of P- and S-waves according to seismic waveforms, is fundamental to earthquake monitoring. Generally, manual phase picking is trustworthy, but with the increasing number of worldwide stations and seismic monitors, it becomes more challenging for human to complete the task comprehensively. In this work, we explore multiple ways to do automatic phase picking, including traditional and learning-based methods.
Picking up multiple objects at once is a grasping skill that makes a human worker efficient in many domains. This paper presents a system to pick a requested number of objects by only picking once (OPO). The proposed Only-Pick-Once System (OPOS) contains several graph-based algorithms that convert the layout of objects into a graph, cluster nodes in the graph, rank and select candidate clusters based on their topology. OPOS also has a multi-object picking predictor based on a convolutional neural network for estimating how many objects would be picked up with a given gripper location and orientation. This paper presents four evaluation metrics and three protocols to evaluate the proposed OPOS. The results show OPOS has very high success rates for two and three objects when only picking once. Using OPOS can significantly outperform two to three times single object picking in terms of efficiency. The results also show OPOS can generalize to unseen size and shape objects.
Object picking in cluttered scenes is a widely investigated field of robot manipulation, however, ambidextrous robot picking is still an important and challenging issue. We found the fusion of different prehensile actions (grasp and suction) can expand the range of objects that can be picked by robot, and the fusion of prehensile action and nonprehensile action (push) can expand the picking space of ambidextrous robot. In this paper, we propose a Push-Grasp-Suction (PGS) tri-mode grasping learning network for ambidextrous robot picking through the fusion of different prehensile actions and the fusion of prehensile action and nonprehensile aciton. The prehensile branch of PGS takes point clouds as input, and the 6-DoF picking configuration of grasp and suction in cluttered scenes are generated by multi-task point cloud learning. The nonprehensile branch with depth image input generates instance segmentation map and push configuration, cooperating with the prehensile actions to complete the picking of objects out of single-arm space. PGS generalizes well in real scene and achieves state-of-the-art picking performance.