Automated environment configuration is a critical bottleneck in scaling software engineering (SWE) automation. To provide a reliable evaluation standard for this task, we present the Multi-Docker-Eval benchmark. It includes 40 real-world repositories spanning 9 programming languages and measures both success in achieving executable states and efficiency under realistic constraints. Our extensive evaluation of state-of-the-art LLMs and agent frameworks reveals key insights: (1) the overall success rate of current models is low (F2P at most 37.7%), with environment construction being the primary bottleneck; (2) model size and reasoning length are not decisive factors, and open-source models like DeepSeek-V3.1 and Kimi-K2 are competitive in both efficiency and effectiveness; (3) the agent framework and programming language also significantly influence the success rate. These findings provide actionable guidelines for building scalable, fully automated SWE pipelines.
Paired image-text data with subtle variations in-between (e.g., people holding surfboards vs. people holding shovels) hold the promise of producing Vision-Language Models with proper compositional understanding. Synthesizing such training data from generative models is a highly coveted prize due to the reduced cost of data collection. However, synthesizing training images for compositional learning presents three challenges: (1) efficiency in generating large quantities of images, (2) text alignment between the generated image and the caption in the exact place of the subtle change, and (3) image fidelity in ensuring sufficient similarity with the original real images in all other places. We propose SPARCL (Synthetic Perturbations for Advancing Robust Compositional Learning), which integrates image feature injection into a fast text-to-image generative model, followed by an image style transfer step, to meet the three challenges. Further, to cope with any residual issues of text alignment, we propose an adaptive margin loss to filter out potentially incorrect synthetic samples and focus the learning on informative hard samples. Evaluation on four compositional understanding benchma
Construction sites frequently require removing large rocks before excavation or grading can proceed. Human operators typically extract these boulders using only standard digging buckets, avoiding time-consuming tool changes to specialized grippers. This task demands manipulating irregular objects with unknown geometries in harsh outdoor environments where dust, variable lighting, and occlusions hinder perception. The excavator must adapt to varying soil resistance--dragging along hard-packed surfaces or penetrating soft ground--while coordinating multiple hydraulic joints to secure rocks using a shovel. Current autonomous excavation focuses on continuous media (soil, gravel) or uses specialized grippers with detailed geometric planning for discrete objects. These approaches either cannot handle large irregular rocks or require impractical tool changes that interrupt workflow. We train a reinforcement learning policy in simulation using rigid-body dynamics and analytical soil models. The policy processes sparse LiDAR points (just 20 per rock) from vision-based segmentation and proprioceptive feedback to control standard excavator buckets. The learned agent discovers different strate
This paper explores the question of creating and maintaining terrain maps in environments where the terrain changes. The specific example explored is the construction of terrain maps from 3D LiDAR measurements on an electric rope shovel. The approach extends the height grid representation of terrain to include a Hidden Markov Model in each cell, enabling confidence-based mapping of constantly changing terrain. There are inherent difficulties in this problem, including semantic labelling of the LiDAR measurements associated with machinery and determining the pose of the sensor. Solutions to both of these problems are explored. The significance of this work lies in the need for accurate terrain mapping to support autonomous machine operation.
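The per-cell Hidden Markov Model idea can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the class name `HmmCell`, the discretisation of height into a few candidate levels, the uniform "stay or jump" transition model, and the Gaussian measurement likelihood are all hypothetical choices. The sketch only shows how a cell can keep a belief over heights, relax that belief over time because the terrain may change, and sharpen it again when a LiDAR height measurement arrives.

```python
import math


class HmmCell:
    """One grid cell holding a belief over discrete candidate heights.

    Hypothetical illustration: the states, transition model and emission
    model are assumptions, not the representation used in the paper.
    """

    def __init__(self, heights, stay_prob=0.9, sigma=0.1):
        self.heights = list(heights)      # candidate heights (hidden states)
        n = len(self.heights)
        self.belief = [1.0 / n] * n       # uniform prior over heights
        self.stay_prob = stay_prob        # probability the terrain is unchanged
        self.sigma = sigma                # assumed measurement noise (same units)

    def predict(self):
        """Time update: terrain may have changed, so the belief spreads out."""
        n = len(self.belief)
        if n == 1:
            return
        move = (1.0 - self.stay_prob) / (n - 1)
        total = sum(self.belief)
        self.belief = [self.stay_prob * b + move * (total - b) for b in self.belief]

    def update(self, z):
        """Measurement update with a Gaussian likelihood around each height."""
        likelihood = [math.exp(-0.5 * ((z - h) / self.sigma) ** 2) for h in self.heights]
        posterior = [l * b for l, b in zip(likelihood, self.belief)]
        norm = sum(posterior)
        if norm > 0:                      # keep the prior if the measurement is implausible
            self.belief = [p / norm for p in posterior]

    def estimate(self):
        """Return the most likely height and its probability (the confidence)."""
        i = max(range(len(self.belief)), key=self.belief.__getitem__)
        return self.heights[i], self.belief[i]
```

In this sketch, calling `predict()` repeatedly without any `update()` lowers the confidence returned by `estimate()`, which is one simple way a map could flag cells that have not been observed recently in changing terrain.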
In a post-industrial society, the workplace is dominated primarily by Knowledge Work, which is achieved mostly through human cognitive processing, such as analysis, comprehension, evaluation, and decision-making. Many of these processes have limited support from technology, unlike physical tasks, which have been enabled through a host of tools from hammers to shovels and hydraulic lifts. To develop a suite of cognitive tools, we first need to understand which processes humans use to complete work tasks. In the past century several classifications (e.g., Bloom's) of cognitive processes have emerged, and we assessed their viability as the basis for designing tools that support cognitive work. This study re-used an existing data set composed of interviews of environmental scientists about their core work. While the classification uncovered many instances of cognitive processes, the results showed that the existing cognitive process classifications do not provide a sufficiently comprehensive deconstruction of the human cognitive processes; the work is quite simply too abstract to be operational.
We consider energy-dispersive X-ray Fluorescence (EDXRF) applications where the fundamental parameters method is impractical such as when instrument parameters are unavailable. For example, on a mining shovel or conveyor belt, rocks are constantly moving (leading to varying angles of incidence and distances) and there may be other factors not accounted for (like dust). Neural networks do not require instrument and fundamental parameters but training neural networks requires XRF spectra labelled with elemental composition, which is often limited because of its expense. We develop a neural network model that learns from limited labelled data and also benefits from domain knowledge by learning to invert a forward model. The forward model uses transition energies and probabilities of all elements and parameterized distributions to approximate other fundamental and instrument parameters. We evaluate the model and baseline models on a rock dataset from a lithium mineral exploration project. Our model works particularly well for some low-Z elements (Li, Mg, Al, and K) as well as some high-Z elements (Sn and Pb) despite these elements being outside the suitable range for common spectromete
Beam profile engineering, where a desired optical intensity distribution can be generated by an array of phase-shifting (or amplitude-changing) elements, is a promising approach in laser material processing. For example, a spatial light modulator (SLM) is a dynamic diffractive optical element allowing for experimental implementations of controllable beam profiles. Scalar Mathieu beams have an elliptical intensity distribution perceivable as optical knives in the transverse plane, and scalar Weber beams have a parabolic distribution, which enables us to call them optical shovels. Here, we introduce vector versions of scalar Mathieu and Weber beams and use those vector beams as a basis to construct controllable on-axis phase and amplitude distributions with polarization control. Further, we generate individual components of optical knife and shovel beams experimentally using SLMs as a toy model and report on our achievements in the control over the beam shape, dimensions and polarization along the propagation axis.
A new study suggests Earth may have been sending tiny hitchhikers to Venus for billions of years. Researchers found that asteroid impacts could launch microbes into space, where some might survive the journey and end up suspended in Venus' clouds. If future missions detect life there, there's a surprising chance it didn't originate on Venus at all
A new AI-powered framework could transform how astronomers measure the expansion of the Universe. By analyzing images of Type Ia supernovae and modeling their environments in unprecedented detail, researchers can estimate cosmic distances with near-spectroscopic accuracy. The technique is designed for the flood of data expected from the upcoming Ve
A new SETI study suggests we may be overlooking alien signals not because they aren't there, but because their own stars are scrambling them before they escape into space. Turbulent plasma and powerful stellar storms can spread an ultra-narrow radio transmission across a wider range of frequencies, making it much harder for traditional searches to
Researchers developed a Wordle-solving strategy that succeeds 99% of the time by focusing on information gain rather than likely answers. The method uses Shannon entropy to identify guesses that reveal the most about the hidden word. Each guess is designed to slash uncertainty and narrow the possibilities faster
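The entropy idea described above can be sketched in a few lines. This is a generic illustration of scoring guesses by expected information, not the researchers' actual implementation; the function names `feedback`, `expected_information` and `best_guess` are hypothetical, and the sketch assumes every remaining candidate word is equally likely. A guess splits the candidates into groups that share the same colour pattern, and the Shannon entropy of that split measures how much the guess is expected to reveal.

```python
import math
from collections import Counter


def feedback(guess, answer):
    """Wordle-style pattern: 'g' = right spot, 'y' = wrong spot, '-' = absent."""
    result = ["-"] * len(guess)
    remaining = Counter()
    # First pass: mark exact matches and count the unmatched answer letters.
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "g"
        else:
            remaining[a] += 1
    # Second pass: mark misplaced letters, respecting duplicate counts.
    for i, g in enumerate(guess):
        if result[i] == "-" and remaining[g] > 0:
            result[i] = "y"
            remaining[g] -= 1
    return "".join(result)


def expected_information(guess, candidates):
    """Shannon entropy (in bits) of the feedback pattern over the candidates."""
    counts = Counter(feedback(guess, answer) for answer in candidates)
    total = len(candidates)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def best_guess(guesses, candidates):
    """Pick the allowed guess with the highest expected information."""
    return max(guesses, key=lambda g: expected_information(g, candidates))
```

A solver built on this would call `best_guess`, observe the real feedback, keep only the candidates consistent with it, and repeat. Note that the best guess by this score need not itself be a possible answer, which matches the contrast with "likely answers" drawn above.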
Astronomers may be closing in on a long-standing cosmic mystery: why some of the universe’s biggest galaxies seem to have far fewer stars than expected. Using NASA- and JAXA-supported XRISM observations of a galaxy called NGC 4151, researchers found strong evidence that supermassive black holes can unleash powerful winds that blow away the raw mate
Researchers found that a Chinese sodium-ion battery performs far better than expected, with production quality and design features comparable to Tesla’s batteries. If engineers can improve cold-weather charging and energy density, sodium could become a cheaper and more abundant alternative to lithium for EVs and large-scale energy storage
A new theory suggests the universe is constantly recording its own history in the fabric of spacetime. If correct, this cosmic memory could help solve some of the biggest puzzles in physics, from black holes to dark matter and the universe’s ultimate fate
A rare meteorite has revealed evidence of a massive lost world that once orbited the young Sun before being destroyed in a catastrophic collision. The discovery suggests some early planets formed from dramatically different materials than Earth and Mars, rewriting part of the solar system’s origin story
What if consciousness isn’t limited to brains like ours? Philosophers Eric Schwitzgebel and Jeremy Pober argue that consciousness could arise in many different forms of life, even in beings built from radically different materials than those found on Earth. Drawing on the vastness of the universe and the likely existence of countless alien civiliza