20 results found
The application performance of modern processors is often limited by the memory subsystem rather than by actual compute capabilities. Therefore, data throughput specifications play a key role in modeling application performance and determining possible bottlenecks. However, while peak instruction throughputs and bandwidths for local caches are often documented, the achievable throughput can also depend on the relation between memory access and compute instructions. In this paper, we present an Arm version of the well-established x86-membench throughput benchmark, which we have adapted to support all current SIMD extensions of the Armv8 instruction set architecture. We describe aspects of the Armv8 ISA that need to be considered in the portable design of this benchmark. We use the benchmark to analyze the memory subsystem at a fine spatial granularity and to unveil microarchitectural details of three processors: Fujitsu A64FX, Ampere Altra and Cavium ThunderX2. Based on the resulting performance information, we show that instruction fetch and decoder widths become a potential bottleneck for cache-bandwidth-sensitive workloads due to the load-store concept of the Arm ISA.
In this paper, we evaluate the portability of the SYCL programming model on some of the latest CPUs and GPUs from a wide range of vendors, utilizing the two main compilers: DPC++ and hipSYCL/OpenSYCL. Both compilers currently support GPUs from all three major vendors; we evaluate performance on the Intel(R) Data Center GPU Max 1100, the NVIDIA A100 GPU, and the AMD MI250X GPU. Support on CPUs is currently less established: DPC++ supports only x86 CPUs through OpenCL, whereas OpenSYCL has an OpenMP backend capable of targeting all modern CPUs. We benchmark the Intel Xeon Platinum 8360Y Processor (Ice Lake), the AMD EPYC 9V33X (Genoa-X), and the Ampere Altra platforms. We study a range of primarily bandwidth-bound applications implemented using the OPS and OP2 DSLs, evaluate different formulations in SYCL, and contrast their performance to "native" programming approaches where available (CUDA/HIP/OpenMP). On GPU architectures, SYCL on average even slightly outperforms native approaches, while on CPUs it falls behind, highlighting a continued need for improving CPU performance. While SYCL does not solve all the challenges of performance portability (e.g. needing differen
Today, LHC offline computing relies heavily on CPU resources, despite the interest in compute accelerators, such as GPUs, for the longer-term future. The number of cores per CPU socket has continued to increase steadily, reaching 64 cores (128 threads) with recent AMD EPYC processors and 128 cores on Ampere Altra Max ARM processors. Over the course of the past decade, the CMS data processing framework, CMSSW, has been transformed from a single-threaded framework into a highly concurrent one. The first multithreaded version was brought into production by the start of the LHC Run 2 in 2015. Since then, the framework's threading efficiency has gradually been improved by adding more levels of concurrency and reducing the number of serial code paths. The latest addition was support for concurrent Runs. In this work we review the concurrency model of CMSSW and measure its scalability with real CMS applications, such as simulation and reconstruction, on modern many-core machines. We show metrics such as event processing throughput and application memory usage with and without the contribution of I/O, as I/O has been the major scaling limitation for the CMS applicat
NASA’s Roman Space Telescope could revolutionize the search for alien worlds by discovering around 100,000 exoplanets, far more than all previous missions combined. It will look deep into unexplored parts of the Milky Way, helping scientists compare planetary systems across very different galactic environments. The mission will also uncover rare Ear
June's night sky delivers several must-see events, starting with a close encounter between Venus and Jupiter after sunset. Mercury joins the pair to form a rare three-planet lineup, while the Moon puts on a special show by passing in front of Venus for viewers in parts of the Americas. The month also marks the start of astronomical summer and the r
Scientists have developed an artificial photosynthesis system that essentially regulates itself, eliminating the need for batteries used in many current designs. The key innovation is an electrolyzer that automatically adapts to changing sunlight by altering its electrical properties as it heats up. This keeps solar fuel production more stable whil
NASA's James Webb Space Telescope has uncovered unusual chemistry in interstellar comet 3I/ATLAS, including the first direct detection of methane on a visitor from another star system. The comet also contains exceptionally high levels of carbon dioxide, making it unlike most comets born in our solar system. Scientists believe the methane was hidden
Scientists at the University of Hong Kong have created a remarkable new type of brain-inspired chip that can function just above absolute zero, one of the coldest environments imaginable. By using a standard silicon carbide transistor in a completely new way, the team made a single device behave like an energy-efficient neuron, firing electrical “s
NASA’s futuristic X-59 jet is about to face its biggest challenge yet: breaking the sound barrier for the first time. After a successful series of test flights that pushed the aircraft to near-supersonic speeds, engineers are preparing to fly it faster than Mach 1 and eventually up to Mach 1.6 at 60,000 feet.
A team at the University of Chicago has discovered a surprisingly simple way to create powerful quantum states that are normally difficult to produce. By making small adjustments to the energy levels of atoms inside an optical cavity, researchers can generate a wide variety of highly entangled states without adding complicated hardware.
Researchers gave top AI models a classic attention test used in psychology and found a major flaw. While the models could correctly name colors in short lists, their performance deteriorated sharply as the task became longer and more complex. Some leading systems fell from over 90% accuracy to nearly complete failure.
The mysterious Amaterasu particle may not be a proton at all. New research suggests that some of the most extreme cosmic rays could be ultraheavy atomic nuclei, heavier than iron, which are better able to retain their energy while traveling through space. This idea could help explain how these rare particles reach Earth and provide new clues about
Scientists have proposed a new method for finding tightly bound supermassive black hole pairs by searching for stars that flash repeatedly as their light is magnified by the black holes’ gravity. The timing and brightness of these bursts could provide a unique fingerprint of black holes slowly spiraling toward a future collision.
A lightweight new X-ray telescope could finally give scientists something they’ve never had before: a complete chemical map of the Moon. Researchers used detailed mission simulations to show that a compact telescope orbiting the Moon could identify key elements across the entire lunar surface, helping reveal how the Moon formed and evolved.
Scientists discovered that rice behaves in a highly unusual way: it weakens under rapid compression but stays stronger when pressure is applied slowly. Using this effect, they engineered a new material that reacts differently to gentle movements and sudden impacts. The material can adapt its stiffness automatically, opening the door to safer soft r