Principled Bayesian inference of galaxy properties has not previously been performed for wide-area weak lensing surveys with millions of sources. We address this gap by applying the pop-cosmos generative model to perform spectral energy distribution (SED) fitting for 4 million KiDS-1000 galaxies. Calibrated on deep COSMOS2020 photometric data, pop-cosmos specifies a physically-motivated prior over the galaxy population up to $z \simeq 6$ in stellar population synthesis (SPS) parameter space. Using the Speculator SPS emulator with GPU-accelerated MCMC sampling, we perform full posterior inference at 6.5 GPU seconds per galaxy, obtaining joint constraints on galaxy redshifts and physical properties. We validate photometric redshifts against $\sim\!185,\!000$ KiDS galaxies cross-matched to DESI DR1 spectroscopic samples, achieving low bias ($3\times10^{-3}$), scatter ($\sigma_{\mathrm{MAD}}=0.04$), and outlier fraction (3.7%) for the Bright Galaxy Survey, with comparable performance (bias $3\times10^{-2}$, $\sigma_{\mathrm{MAD}}=0.05$, 1.3% outliers) for luminous red galaxies (LRGs). Within the LRG sample, we identify massive, dusty, star-forming contaminants at $z \simeq 0.4$ satisfying standa
Fine-grained visual categorization (FGVC) is a challenging but significant task in computer vision that aims to recognize different sub-categories of birds, cars, airplanes, etc. Among these tasks, recognizing different car models has significant application value in autonomous driving, traffic surveillance, and scene understanding, and has received considerable attention in the past few years. However, Stanford-Car, the most widely used fine-grained dataset for car recognition, has only 196 categories and includes only vehicle models produced earlier than 2013. Due to the rapid advancements in the automotive industry in recent years, the appearances of car models have become increasingly intricate and sophisticated. Consequently, the Stanford-Car dataset fails to capture this evolving landscape and cannot satisfy the requirements of the automotive industry. To address these challenges, we introduce Car-1000, a large-scale dataset designed specifically for fine-grained visual categorization of diverse car models. Car-1000 encompasses vehicles from 166 different automakers, spanning 1000 distinct car models. Additionally, we h
Lanthanides are nowadays extensively used to investigate the properties of strongly correlated matter. Nevertheless, exploiting the Zeeman manifold of a lanthanide atom ground state is challenging due to the unavoidable presence of depolarization collisions. Here we demonstrate that in the case of the thulium atom, it is possible to suppress this depolarization by a factor of 1000 with a carefully tuned magnetic field thus opening the way for the efficient use of the Zeeman manifold in quantum simulations.
We present a culturally-grounded multimodal dataset of 1,060 traditional recipes crowdsourced from rural communities across remote regions of Eastern India, spanning 10 endangered languages. These recipes, rich in linguistic and cultural nuance, were collected using a mobile interface designed for contributors with low digital literacy. The dataset, Endangered Language Recipes (ELR)-1000, captures not only culinary practices but also the socio-cultural context embedded in indigenous food traditions. We evaluate the performance of several state-of-the-art large language models (LLMs) on translating these recipes into English and find that, despite their capabilities, the models struggle with low-resource, culturally-specific language. However, we observe that providing targeted context -- including background information about the languages, translation examples, and guidelines for cultural preservation -- leads to significant improvements in translation quality. Our results underscore the need for benchmarks that cater to underrepresented languages and domains to advance equitable and culturally-aware language technologies. As part of this work, we release the ELR-1000 dataset to the
We present a detailed analysis of the end-user performance of the metrological signal at 1542 nm disseminated by the French national fibre network Refimeve, about 1000 km from the source. By means of a local ultrastable laser at 729 nm and a frequency comb, we carry out stability and phase noise measurements of the signal with respect to the local laser. With a focus on phase noise analysis, we identify different timescales of interest for the use of this signal in optical frequency metrology.
Large multimodal models (LMMs) have achieved impressive progress in vision-language understanding, yet they face limitations in real-world applications requiring complex reasoning over a large number of images. Existing benchmarks for multi-image question-answering are limited in scope: each question is paired with at most 30 images, which does not fully capture the demands of large-scale retrieval tasks encountered in real-world use. To reduce these gaps, we introduce two document haystack benchmarks, dubbed DocHaystack and InfoHaystack, designed to evaluate LMM performance on large-scale visual document retrieval and understanding. Additionally, we propose V-RAG, a novel, vision-centric retrieval-augmented generation (RAG) framework that leverages a suite of multimodal vision encoders, each optimized for specific strengths, and a dedicated question-document relevance module. V-RAG sets a new standard, with a 9% and 11% improvement in Recall@1 on the challenging DocHaystack-1000 and InfoHaystack-1000 benchmarks, respectively, compared to the previous best baseline models. Additionally, integrating V-RAG with LMMs enables them to efficiently operate across thousands of im
Ferroelectric (Fe) materials-based devices show great promise for non-volatile memory applications, yet few demonstrate reliable operation at elevated temperatures. In this work, we demonstrate Ni/Al0.68Sc0.32N/4H-SiC metal-ferroelectric-semiconductor capacitors for high-temperature non-volatile memory applications. Our 30-nm thick ferroelectric Al0.68Sc0.32N film grown on SiC exhibits stable and robust ferroelectric switching up to 1000°C. The coercive field decreases linearly from -6.4/+11.9 MV/cm at room temperature to -3.1/+7.8 MV/cm at 800°C, further reducing to -2.5 MV/cm at 1000°C. At 600°C, the devices achieve remarkable reliability with ~2000 endurance cycles and at least 100 hours of retention with negligible polarization loss. At 800°C, the devices retain data for at least 10,000 seconds and exceed 400 write cycles. Our results further highlight the potential of ferroelectric AlScN thin films, particularly when paired with SiC semiconductor substrates, for high-temperature non-volatile memory.
Aluminum scandium nitride (AlScN) has emerged as a highly promising material for high-temperature applications due to its robust piezoelectric, ferroelectric, and dielectric properties. This study investigates the behavior of Al0.7Sc0.3N thin films in extreme thermal environments, demonstrating functional stability up to 1000°C, making it suitable for use in aerospace, hypersonics, deep-well, and nuclear reactor systems. Tantalum silicide (TaSi2)/Al0.7Sc0.3N/TaSi2 capacitors were fabricated and characterized across a wide temperature range, revealing robust ferroelectric and dielectric properties, along with significant enhancement in piezoelectric performance. At 1000°C, the ferroelectric hysteresis loops showed a substantial reduction in coercive field from 4.3 MV/cm to 1.2 MV/cm, while the longitudinal piezoelectric coefficient increased nearly tenfold, reaching 75.1 pm/V at 800°C. Structural analysis via scanning and transmission electron microscopy confirmed the integrity of the TaSi2/Al0.7Sc0.3N interfaces, even after exposure to extreme temperatures. Furthermore, the electromechanical coupling coefficient was calculated to increase by over 500%, from 12.9% at room temperatur
Team projects in Computer Science (CS) help students build collaboration skills, apply theory, and prepare for real-world software development. Online classes present unique opportunities to transform the accessibility of CS education at scale. Still, the geographical distribution of students and staff adds complexity to forming effective teams, providing consistent feedback, and facilitating peer interactions. We discuss our approach of managing, evaluating, and providing constructive feedback to over 200 project teams, comprising 1000+ graduate students distributed globally, two professors, and 25+ teaching assistants. We deployed and iteratively refined this approach over 10 years while offering the Data and Visual Analytics course (CSE 6242) at Georgia Institute of Technology. Our approach and insights can help others striving to make CS education accessible, especially in online and large-scale settings.
For next-generation neutrinoless double beta decay experiments, extremely low backgrounds are necessary. An understanding of in-situ cosmogenic backgrounds is critical to the design effort. In-situ cosmogenic backgrounds impose a depth requirement and especially impact the choice of host laboratory. Often, simulations are used to understand background effects, and these simulations can have large uncertainties. One way to characterize the systematic uncertainties is to compare different simulation programs. In this paper, a suite of neutron simulations with identical geometries and starting parameters has been performed with Geant4 and MCNP, using geometries relevant to the LEGEND-1000 experiment. This study is an important step in gauging the uncertainties of simulation-based estimates. To reduce project risks associated with simulation uncertainties, a novel alternative shield of methane-doped liquid argon is considered in this paper for LEGEND-1000, which could achieve large background reduction without requiring significant modification to the baseline design.
This paper performs the first cosmological parameter analysis of the KiDS-1000 data with second- and third-order shear statistics. This work builds on a series of papers that describe the roadmap to third-order shear statistics. We derive and test a combined model of the second-order shear statistic, namely the COSEBIs, and the third-order aperture mass statistic $\langle M_\mathrm{ap}^3\rangle$ in a tomographic set-up. We validate our pipeline with $N$-body simulations that mock the fourth Kilo-Degree Survey data release. To model the second- and third-order statistics, we use the latest version of \textsc{HMcode2020} for the power spectrum and \textsc{BiHalofit} for the bispectrum. Furthermore, we use an analytic description to model intrinsic alignments and hydro-dynamical simulations to model the effect of baryonic feedback processes. Lastly, we decrease the dimension of the data vector significantly by considering only equal smoothing radii for the $\langle M_\mathrm{ap}^3\rangle$ part of the data vector, which makes a combined analysis of the fourth Kilo-Degree Survey data release with COSEBIs and third-order shear statistics possible. We first validate the accurac
In the past decade, advances in deep learning have resulted in breakthroughs in a variety of areas, including computer vision, natural language understanding, speech recognition, and reinforcement learning. Specialized, high-performing neural architectures are crucial to the success of deep learning in these areas. Neural architecture search (NAS), the process of automating the design of neural architectures for a given task, is an inevitable next step in automating machine learning and has already outpaced the best human-designed architectures on many tasks. In the past few years, research in NAS has been progressing rapidly, with over 1000 papers released since 2020 (Deng and Lindauer, 2021). In this survey, we provide an organized and comprehensive guide to neural architecture search. We give a taxonomy of search spaces, algorithms, and speedup techniques, and we discuss resources such as benchmarks, best practices, other surveys, and open-source libraries.
We present refined cosmological parameter constraints derived from a cosmic shear analysis of the fourth data release of the Kilo-Degree Survey (KiDS-1000). Our main improvements include enhanced galaxy shape measurements made possible by an updated version of the lensfit code and improved shear calibration achieved with a newly developed suite of multi-band image simulations. Additionally, we incorporated recent advancements in cosmological inference from the joint Dark Energy Survey Year 3 and KiDS-1000 cosmic shear analysis. Assuming a spatially flat standard cosmological model, we constrain $S_8\equiv\sigma_8(\Omega_{\rm m}/0.3)^{0.5} = 0.776_{-0.027-0.003}^{+0.029+0.002}$, where the second set of uncertainties accounts for the systematic uncertainties within the shear calibration. These systematic uncertainties stem from minor deviations from realism in the image simulations and the sensitivity of the shear measurement algorithm to the morphology of the galaxy sample. Despite these changes, our results align with previous KiDS studies and other weak lensing surveys, and we find a ${\sim}2.3\sigma$ level of tension with the Planck cosmic microwave background constraints on $S_8$.
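As a worked illustration of the $S_8$ definition above, the following minimal Python sketch evaluates $S_8 = \sigma_8(\Omega_{\rm m}/0.3)^{0.5}$ and a simple Gaussian estimate of the tension between two measurements. The function names and all numerical inputs are hypothetical and chosen only for illustration. The difference-over-quadrature-sum estimate is a common approximation and is not necessarily the tension metric used in the analysis.

```python
import math


def s8(sigma8: float, omega_m: float, alpha: float = 0.5) -> float:
    """Return S_8 = sigma_8 * (Omega_m / 0.3)**alpha."""
    return sigma8 * (omega_m / 0.3) ** alpha


def gaussian_tension(a: float, err_a: float, b: float, err_b: float) -> float:
    """Difference between two values in units of their combined Gaussian error.

    Assumes independent, symmetric, Gaussian uncertainties (an approximation).
    """
    return abs(a - b) / math.hypot(err_a, err_b)


# Hypothetical inputs, for illustration only (not values from any survey).
example_s8 = s8(sigma8=0.80, omega_m=0.27)
example_tension = gaussian_tension(0.78, 0.03, 0.83, 0.02)
print(f"S_8 = {example_s8:.3f}, tension = {example_tension:.2f} sigma")
```

With asymmetric uncertainties such as those quoted above, one would have to choose which side of each error bar to use; this sketch does not address that choice.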
Sapphire fiber can withstand around 2000°C, but it is multimoded, which yields sensors with poor precision. We demonstrate a single-mode sapphire fiber Bragg grating temperature sensor operating up to 1200°C. The repeatability above 1000°C is within $\pm$0.08%.
We present a joint cosmic shear analysis of the Dark Energy Survey (DES Y3) and the Kilo-Degree Survey (KiDS-1000) in a collaborative effort between the two survey teams. We find consistent cosmological parameter constraints between DES Y3 and KiDS-1000 which, when combined in a joint-survey analysis, constrain the parameter $S_8 = \sigma_8 \sqrt{\Omega_{\rm m}/0.3}$ with a mean value of $0.790^{+0.018}_{-0.014}$. The mean marginal is lower than the maximum a posteriori estimate, $S_8=0.801$, owing to skewness in the marginal distribution and projection effects in the multi-dimensional parameter space. Our results are consistent with $S_8$ constraints from observations of the cosmic microwave background by Planck, with agreement at the $1.7\sigma$ level. We use a Hybrid analysis pipeline, defined from a mock survey study quantifying the impact of the different analysis choices originally adopted by each survey team. We review intrinsic alignment models, baryon feedback mitigation strategies, priors, samplers and models of the non-linear matter power spectrum.
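The statement that a skewed marginal pulls the mean away from the peak can be illustrated with a one-dimensional toy model. The sketch below assumes a two-piece (split) normal distribution, which is a stand-in chosen for illustration and not the posterior from the analysis; the mode and widths are hypothetical. It covers only the skewness effect in one dimension and does not model projection effects, and the maximum a posteriori estimate in the abstract is defined in the full parameter space rather than on the marginal.

```python
import math


def split_normal_pdf(x: float, mode: float, sigma_left: float, sigma_right: float) -> float:
    """Density of a two-piece normal with different widths either side of the mode."""
    norm = math.sqrt(2.0 / math.pi) / (sigma_left + sigma_right)
    sigma = sigma_left if x < mode else sigma_right
    return norm * math.exp(-0.5 * ((x - mode) / sigma) ** 2)


def split_normal_mean(mode: float, sigma_left: float, sigma_right: float) -> float:
    """Closed-form mean: mode + sqrt(2/pi) * (sigma_right - sigma_left)."""
    return mode + math.sqrt(2.0 / math.pi) * (sigma_right - sigma_left)


def numerical_mean(mode: float, sigma_left: float, sigma_right: float, n: int = 20001) -> float:
    """Cross-check of the mean using a simple weighted sum on a uniform grid."""
    lo = mode - 10.0 * sigma_left
    hi = mode + 10.0 * sigma_right
    step = (hi - lo) / (n - 1)
    total = 0.0
    weight = 0.0
    for i in range(n):
        x = lo + i * step
        p = split_normal_pdf(x, mode, sigma_left, sigma_right)
        total += x * p
        weight += p
    return total / weight


# Hypothetical toy values: a longer tail towards low values (left-skewed).
toy_mode, toy_left, toy_right = 0.80, 0.03, 0.01
print("mode:", toy_mode)
print("mean (closed form):", split_normal_mean(toy_mode, toy_left, toy_right))
print("mean (numerical):  ", numerical_mean(toy_mode, toy_left, toy_right))
```

When the left-hand width exceeds the right-hand width, the mean falls below the mode, which is the direction of the shift described above.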
We present a cosmic shear analysis with an improved redshift calibration for the fourth data release of the Kilo-Degree Survey (KiDS-1000) using self-organising maps (SOMs). Compared to the previous analysis of the KiDS-1000 data, we expand the redshift calibration sample to more than twice its size, now consisting of data from 17 spectroscopic redshift campaigns, and significantly extend the fraction of KiDS galaxies we are able to calibrate with our SOM redshift methodology. We then enhance the calibration sample with precision photometric redshifts from COSMOS2015 and the Physics of the Accelerated Universe Survey (PAUS), allowing us to fill gaps in the spectroscopic coverage of the KiDS data. Finally, we perform a Complete Orthogonal Sets of E/B-Integrals (COSEBIs) cosmic shear analysis of the newly calibrated KiDS sample. We find $S_8 = 0.748_{-0.025}^{+0.021}$, which is in good agreement with previous KiDS studies and increases the tension with measurements of the cosmic microwave background to 3.4σ. We repeat the redshift calibration with different subsets of the full calibration sample and obtain, in all cases, agreement within at most 0.5σ in $S_8$ compared to our fiducial
The 1000 Genomes Project provides sequencing data on 3,202 samples from 26 populations spanning five continental regions with no access or use restrictions. The kgp R package provides consistent and comprehensive metadata about samples and populations in the 1000 Genomes Project and other population sequencing data in the International Genome Sample Resource collection. The kgp package is distributed via the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/package=kgp. Source code is available on GitHub at https://github.com/stephenturner/kgp. Further documentation is online at https://stephenturner.github.io/kgp/.
There is a widespread assumption that the peak velocities of visually guided saccades in the dark are up to 10% slower than those made in the light. Studies that have examined the impact of the surrounding brightness conditions come to differing conclusions about whether these conditions have an influence and, if so, in what manner. The problem is complex because the illumination condition may not contribute to different measured peak velocities on its own, but rather in combination with the estimation of the pupil size, owing to its deformation during saccades or to different gaze positions. Even the measurement technique of video-based eye tracking itself could play a significant role. To investigate this issue, we constructed a stepper-motor-driven artificial eye with fixed pupil size to mimic human saccades with predetermined peak velocities and amplitudes under three different brightness conditions with the EyeLink 1000, one of the most commonly used eye trackers. The aim was to control the pupil and brightness. With our device, an overall good accuracy and precision of the EyeLink 1000 could be confirmed. Furthermore, we could find that there is no artifact for pupil based eye tracking in r
We introduce DS-1000, a code generation benchmark with a thousand data science problems spanning seven Python libraries, such as NumPy and Pandas. Compared to prior works, DS-1000 incorporates three core features. First, our problems reflect diverse, realistic, and practical use cases since we collected them from StackOverflow. Second, our automatic evaluation is highly specific (reliable) -- across all Codex-002-predicted solutions that our evaluation accepts, only 1.8% are incorrect; we achieve this with multi-criteria metrics, checking both functional correctness by running test cases and surface-form constraints by restricting API usages or keywords. Finally, we proactively defend against memorization by slightly modifying our problems to be different from the original StackOverflow source; consequently, models cannot answer them correctly by memorizing the solutions from pre-training. The current best public system (Codex-002) achieves 43.3% accuracy, leaving ample room for improvement. We release our benchmark at https://ds1000-code-gen.github.io.
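The multi-criteria idea described above, accepting a solution only if it passes test cases and also satisfies surface-form constraints, can be sketched as follows. This is a minimal illustration under stated assumptions, not the benchmark's actual evaluation harness: the function names, the substring-based keyword check, and the toy problem are all hypothetical.

```python
from typing import Any, Sequence, Tuple

TestCase = Tuple[tuple, Any]  # (positional arguments, expected return value)


def passes_surface_constraints(source: str,
                               required: Sequence[str] = (),
                               forbidden: Sequence[str] = ()) -> bool:
    """Naive substring check: every required token present, no forbidden token present."""
    return (all(tok in source for tok in required)
            and not any(tok in source for tok in forbidden))


def passes_functional_tests(source: str, entry_point: str,
                            test_cases: Sequence[TestCase]) -> bool:
    """Execute the candidate and compare its outputs with the expected values.

    Caution: exec runs arbitrary code; a real harness would sandbox this step.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)
        func = namespace[entry_point]
        return all(func(*args) == expected for args, expected in test_cases)
    except Exception:
        return False


def accept(source: str, entry_point: str, test_cases: Sequence[TestCase],
           required: Sequence[str] = (), forbidden: Sequence[str] = ()) -> bool:
    """Accept only if both the surface-form and the functional criteria hold."""
    return (passes_surface_constraints(source, required, forbidden)
            and passes_functional_tests(source, entry_point, test_cases))


# Hypothetical toy problem: "sum a list without writing an explicit loop".
cases = [(([1, 2, 3],), 6), (([],), 0)]
builtin_solution = "def total(xs):\n    return sum(xs)\n"
looped_solution = ("def total(xs):\n    t = 0\n    for x in xs:\n"
                   "        t += x\n    return t\n")
wrong_solution = "def total(xs):\n    return len(xs)\n"

print(accept(builtin_solution, "total", cases, forbidden=("for ", "while ")))
print(accept(looped_solution, "total", cases, forbidden=("for ", "while ")))
print(accept(wrong_solution, "total", cases, forbidden=("for ", "while ")))
```

In this toy setup the looped solution is functionally correct but is rejected by the keyword constraint, which shows why the two criteria are checked together.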