Key\-value separation is used in LSM\-tree to stored large value in separate log files to reduce write amplification, but requires garbage collection to garbage collect invalid values. Existing garbage collection techniques in LSM\-tree typically adopt static parameter based garbage collection to garbage collect obsolete values which struggles to achieve low write amplification and it's challenging to find proper parameter for garbage collection triggering. In this work we introduce DumpKV, which introduces learning based lifetime aware garbage collection with dynamic lifetime adjustment to do efficient garbage collection to achieve lower write amplification. DumpKV manages large values using trained lightweight model with features suitable for various application based on past write access information of keys to give lifetime prediction for each individual key to enable efficient garbage collection. To reduce interference to write throughput DumpKV conducts feature collection during L0\-L1 compaction leveraging the fact that LSM\-tree is small under KV separation. Experimental results show that DumpKV achieves lower write amplification by 38\%\-73\% compared to existing key\-value
The multilayer garbage disposal game is an evolution of the garbage disposal game. Each layer represents a social relationship within a system of finitely many individuals and finitely many layers. An agent can redistribute their garbage and offload it onto their social neighbors in each layer at each time step. We study the game from a mathematical perspective rather than applying game theory. We investigate the scenario where all agents choose to average their garbage before offloading an equal proportion of it onto their social neighbors. It turns out that the garbage amounts of all agents in all layers converge to the initial average of all agents across all layers when all social graphs are connected and have order at least three. This implies that the winners are those agents whose initial total garbage exceeds the average total garbage across all agents.
Tabular machine learning presents a paradox: modern models achieve state-of-the-art performance using high-dimensional (high-D), collinear, error-prone data, defying the "Garbage In, Garbage Out" mantra. To help resolve this, we synthesize principles from Information Theory, Latent Factor Models, and Psychometrics, clarifying that predictive robustness arises not solely from data cleanliness, but from the synergy between data architecture and model capacity. Partitioning predictor-space "noise" into "Predictor Error" and "Structural Uncertainty" (informational deficits from stochastic generative mappings), we prove that leveraging high-D sets of error-prone predictors asymptotically overcomes both types of noise, whereas cleaning a low-D set is fundamentally bounded by Structural Uncertainty. We demonstrate why "Informative Collinearity" (dependencies from shared latent causes) enhances reliability and convergence efficiency, and explain why increased dimensionality reduces the latent inference burden, enabling feasibility with finite samples. To address practical constraints, we propose "Proactive Data-Centric AI" to identify predictors that enable robustness efficiently. We also
Efficient waste management and recycling heavily rely on garbage exploration and identification. In this study, we propose GSA2Seg (Garbage Segmentation and Attribute Analysis), a novel visual approach that utilizes quadruped robotic dogs as autonomous agents to address waste management and recycling challenges in diverse indoor and outdoor environments. Equipped with advanced visual perception system, including visual sensors and instance segmentators, the robotic dogs adeptly navigate their surroundings, diligently searching for common garbage items. Inspired by open-vocabulary algorithms, we introduce an innovative method for object attribute analysis. By combining garbage segmentation and attribute analysis techniques, the robotic dogs accurately determine the state of the trash, including its position and placement properties. This information enhances the robotic arm's grasping capabilities, facilitating successful garbage retrieval. Additionally, we contribute an image dataset, named GSA2D, to support evaluation. Through extensive experiments on GSA2D, this paper provides a comprehensive analysis of GSA2Seg's effectiveness. Dataset available: \href{https://www.kaggle.com/dat
The garbage disposal game involves a finite set of individuals, each of whom updates their garbage by either receiving from or dumping onto others. We examine the case where only social neighbors, whose garbage levels differ by a given threshold, can offload an equal proportion of their garbage onto others. Remarkably, in the absence of this threshold, the garbage amounts of all individuals converge to the initial average on any connected social graph that is not a star.
The Virtual Garbage Collector (VGC) proposes a zone-based memory management architecture aimed at improving execution predictability and memory behavior in Python runtimes. The design explores a dual-layer model consisting of an Active VGC, responsible for managing runtime object lifecycles, and a Passive VGC, intended as a compile-time optimization layer for static allocation planning. Rather than relying on traditional heap traversal or generational heuristics, VGC introduces memory zoning and checkpoint-based state evaluation to reduce allocation churn and constrain garbage collection scope. Execution partitioning is experimentally evaluated to isolate workloads and localize memory pressure, enabling more deterministic behavior under loop-intensive, recursive, and compute-heavy workloads. This work presents the architectural principles, execution model, and experimental observations of VGC within a partition-aware runtime context. While the full realization of the dual-layer design is an ongoing effort, the results indicate that zone-based allocation and partitioned execution provide a viable foundation for improving scalability and memory predictability in Python-oriented syste
Household garbage images are usually faced with complex backgrounds, variable illuminations, diverse angles, and changeable shapes, which bring a great difficulty in garbage image classification. Due to the ability to discover problem-specific features, deep learning and especially convolutional neural networks (CNNs) have been successfully and widely used for image representation learning. However, available and stable household garbage datasets are insufficient, which seriously limits the development of research and application. Besides, the state of the art in the field of garbage image classification is not entirely clear. To solve this problem, in this study, we built a new open benchmark dataset for household garbage image classification by simulating different lightings, backgrounds, angles, and shapes. This dataset is named 30 Classes of Household Garbage Images (HGI-30), which contains 18,000 images of 30 household garbage classes. The publicly available HGI-30 dataset allows researchers to develop accurate and robust methods for household garbage recognition. We also conducted experiments and performance analysis of the state-of-the-art deep CNN methods on HGI-30, which s
Garbage disposal is a challenging problem throughout the developed world. In Cyprus, as elsewhere, illegal ``fly-tipping" is a significant issue, especially in rural areas where few legal garbage disposal options exist. However, there is a lack of studies that attempt to measure the scale of this problem, and few resources available to address it. A method of automating the process of identifying garbage dumps would help counter this and provide information to the relevant authorities. The aim of this study was to investigate the degree to which artificial intelligence techniques, together with satellite imagery, can be used to identify illegal garbage dumps in the rural areas of Cyprus. This involved collecting a novel dataset of images that could be categorised as either containing, or not containing, garbage. The collection of such datasets in sufficient raw quantities is time consuming and costly. Therefore a relatively modest baseline set of images was collected, then data augmentation techniques used to increase the size of this dataset to a point where useful machine learning could occur. From this set of images an artificial neural network was trained to recognise the prese
This paper proposes a smart way to manage municipal solid waste by using the Internet of Things (IoT) and computer vision (CV) to monitor illegal waste dumping at garbage vulnerable points (GVPs) in urban areas. The system can quickly detect and monitor dumped waste using a street-level camera and object detection algorithm. Data was collected from the Sangareddy district in Telangana, India. A series of comprehensive experiments was carried out using the proposed dataset to assess the accuracy and overall performance of various object detection models. Specifically, we performed an in-depth evaluation of YOLOv8, YOLOv10, YOLO11m, and RT-DETR on our dataset. Among these models, YOLO11m achieved the highest accuracy of 92.39\% in waste detection, demonstrating its effectiveness in detecting waste. Additionally, it attains an mAP@50 of 0.91, highlighting its high precision. These findings confirm that the object detection model is well-suited for monitoring and tracking waste dumping events at GVP locations. Furthermore, the system effectively captures waste disposal patterns, including hourly, daily, and weekly dumping trends, ensuring comprehensive daily and nightly monitoring.
Although existing garbage collectors (GCs) perform extremely well on typical programs, there still exist pathological programs for which modern GCs significantly degrade performance. This observation begs the question: might there exist a 'holy grail' GC algorithm, as yet undiscovered, guaranteeing both constant-length pause times and that memory is collected promptly upon becoming unreachable? For decades, researchers have understood that such a GC is not always possible, i.e., some pathological behavior is unavoidable when the program can make heap cycles and operates near the memory limit, regardless of the GC algorithm used. However, this understanding has until now been only informal, lacking a rigorous formal proof. This paper complements that informal understanding with a rigorous proof, showing with mathematical certainty that every GC that can implement a realistic mutator-observer interface has some pathological program that forces it to either introduce a long GC pause into program execution or reject an allocation even though there is available space. Hence, language designers must either accept these pathological scenarios and design heuristic approaches that minimize
Rust is a non-Garbage Collected (GCed) language, but the lack of GC makes expressing data-structures that require shared ownership awkward, inefficient, or both. In this paper we explore a new design for, and implementation of, GC in Rust, called Alloy. Unlike previous approaches to GC in Rust, Alloy allows existing Rust destructors to be automatically used as GC finalizers: this makes Alloy integrate better with existing Rust code than previous solutions but introduces surprising soundness and performance problems. Alloy provides novel solutions for the core problems: finalizer safety analysis rejects unsound destructors from automatically being reused as finalizers; finalizer elision optimises away unnecessary finalizers; and premature finalizer prevention ensures that finalizers are only run when it is provably safe to do so.
The exponential growth in waste production due to rapid economic and industrial development necessitates efficient waste management strategies to mitigate environmental pollution and resource depletion. Leveraging advancements in computer vision, this study proposes a novel approach inspired by pixel distribution learning techniques to enhance automated garbage classification. The method aims to address limitations of conventional convolutional neural network (CNN)-based approaches, including computational complexity and vulnerability to image variations. We will conduct experiments using the Kaggle Garbage Classification dataset, comparing our approach with existing models to demonstrate the strength and efficiency of pixel distribution learning in automated garbage classification technologies.
Compared to the more commonly used time-based profiling, allocation profiling provides an alternate view of the execution of allocation heavy dynamically typed languages. However, profiling every single allocation in a program is very inefficient. We present a sampling allocation profiler that is deeply integrated into the garbage collector of PyPy, a Python virtual machine. This integration ensures tunable low overhead for the allocation profiler, which we measure and quantify. Enabling allocation sampling profiling with a sampling period of 4 MB leads to a maximum time overhead of 25% in our benchmarks, over un-profiled regular execution.
This paper presents a novel garbage pickup robot which operates on the grass. The robot is able to detect the garbage accurately and autonomously by using a deep neural network for garbage recognition. In addition, with the ground segmentation using a deep neural network, a novel navigation strategy is proposed to guide the robot to move around. With the garbage recognition and automatic navigation functions, the robot can clean garbage on the ground in places like parks or schools efficiently and autonomously. Experimental results show that the garbage recognition accuracy can reach as high as 95%, and even without path planning, the navigation strategy can reach almost the same cleaning efficiency with traditional methods. Thus, the proposed robot can serve as a good assistance to relieve dustman's physical labor on garbage cleaning tasks.
Garbage production and littering are persistent global issues that pose significant environmental challenges. Despite large-scale efforts to manage waste through collection and sorting, existing approaches remain inefficient, leading to inadequate recycling and disposal. Therefore, developing advanced AI-based systems is less labor intensive approach for addressing the growing waste problem more effectively. These models can be applied to sorting systems or possibly waste collection robots that may produced in the future. AI models have grown significantly at identifying objects through object detection. This paper reviews the implementation of AI models for classifying trash through object detection, specifically focusing on using YOLO V5 for training and testing. The study demonstrates how YOLO V5 can effectively identify various types of waste, including plastic, paper, glass, metal, cardboard, and biodegradables.
State Machine Replication (SMR) protocols form the backbone of many distributed systems. Enterprises and startups increasingly build their distributed systems on the cloud due to its many advantages, such as scalability and cost-effectiveness. One of the first technical questions companies face when building a system on the cloud is which programming language to use. Among many factors that go into this decision is whether to use a language with garbage collection (GC), such as Java or Go, or a language with manual memory management, such as C++ or Rust. Today, companies predominantly prefer languages with GC, like Go, Kotlin, or even Python, due to ease of development; however, there is no free lunch: GC costs resources (memory and CPU) and performance (long tail latencies due to GC pauses). While there have been anecdotal reports of reduced cloud cost and improved tail latencies when switching from a language with GC to a language with manual memory management, so far, there has not been a systematic study of the GC overhead of running an SMR-based cloud system. This paper studies the overhead of running an SMR-based cloud system written in a language with GC. To this end, we des
Waste management is one of the significant problems throughout the world. Contemporaneous methods find it difficult to manage the volume of solid waste generated by the growing urban population. In this paper, we propose a system which is very hygienic and cheap that uses Artificial Intelligence algorithms for detection of the garbage. Once the garbage is detected the system calculates the position of the garbage by the use of the camera only. The proposed system is capable of distinguishing between valuables and garbage with more than 95% confidence in real-time. Finally, a robotic arm controlled by the microcontroller is used to pick up the garbage and places it in the bin. Concluding, the paper explains a system that is capable of working as a human in terms of inspecting and collecting the garbage. The system is able to achieve 3-4 frames per second on the Raspberry Pi, capable of detecting the garbage in real-time with 90%+ confidence.
In order to solve the recent defect in garbage classification - including low level of intelligence, low accuracy and high cost of equipment, this paper presents a series of methods in identification and judgment in intelligent garbage classification, including a material identification based on thermal principle and non-destructive laser irradiation, another material identification based on optical diffraction and phase analysis, a profile identification which utilizes a scenery thermal image after PCA and histogram correction, another profile identification which utilizes computer vision with innovated data sets and algorithms. Combining AHP and Bayesian formula, the paper innovates a coupling algorithm which helps to make a comprehensive judgment of the garbage sort, based on the material and profile identification. This paper also proposes a method for real-time space measurement of garbage cans, which based on the characteristics of air as fluid, and analyses the functions of air cleaning and particle disposing. Instead of the single use of garbage image recognition, this paper provides a comprehensive method to judge the garbage sort by material and profile identifications, w
A widespread practice to implement a flexible array is to consider the storage area into two parts: the used area, which is already available for read/write operations, and the supply area, which is used in case of enlargement of the array. The main purpose of the supply area is to avoid as much as possible the reallocation of the whole storage area in case of enlargement. As the supply area is not used by the application, the main idea of the paper is to convey the information to the garbage collector, making it possible to avoid completely the marking of the supply area. We also present a simple method to analyze the types of objects, which are stored in an array as well as the possible presence of NULL values within the array. This allows us to better specialize the work of the garbage collector when marking the used area, and also, by transitivity, to improve overall results for type analysis of all expressions of the source code. After introducing several abstract data types, which represent the main arrays concerned by our technique (i.e., zero or variable indexing, circular arrays and hash maps), we measure its impact during the bootstrap of two compilers whose libraries are
The transformation of graphs and graph-like structures is ubiquitous in computer science. When a system is described by graph-transformation rules, it is often desirable that the rules are both terminating and confluent so that rule applications in an arbitrary order produce unique resulting graphs. However, there are application scenarios where the rules are not globally confluent but confluent on a subclass of graphs that are of interest. In other words, non-resolvable conflicts can only occur on graphs that are considered as "garbage". In this paper, we introduce the notion of confluence up to garbage and generalise Plump's critical pair lemma for double-pushout graph transformation, providing a sufficient condition for confluence up to garbage by non-garbage critical pair analysis. We apply our results in two case studies about efficient language recognition: we present backtracking-free graph reduction systems which recognise a class of flow diagrams and a class of labelled series-parallel graphs, respectively. Both systems are non-confluent but confluent up to garbage. We also give a critical pair condition for subcommutativity up to garbage which, together with closedness, i