Found 20 results
The tech leaders, with combined net worths exceeding $670 billion, have brought props to court and traded icy stares as their legal dispute reaches a denouement
Instruction tuning large language models (LLMs) using machine-generated instruction-following data has improved zero-shot capabilities on new tasks, but the idea is less explored in the multimodal field. In this paper, we present the first attempt to use language-only GPT-4 to generate multimodal language-image instruction-following data. By instruction tuning on such generated data, we introduce LLaVA: Large Language and Vision Assistant, an end-to-end trained large multimodal model that connects a vision encoder and LLM for general-purpose visual and language understanding. Our early experiments show that LLaVA demonstrates impressive multimodal chat abilities, sometimes exhibiting the behaviors of multimodal GPT-4 on unseen images/instructions, and yields an 85.1% relative score compared with GPT-4 on a synthetic multimodal instruction-following dataset. When fine-tuned on Science QA, the synergy of LLaVA and GPT-4 achieves a new state-of-the-art accuracy of 92.53%. We make GPT-4 generated visual instruction tuning data, our model and code base publicly available.
OpenAI has recently released GPT-4 (a.k.a. ChatGPT Plus), which is demonstrated to be one small step for generative AI (GAI), but one giant leap for artificial general intelligence (AGI). Since its official release in November 2022, ChatGPT has quickly attracted numerous users with extensive media coverage. Such unprecedented attention has also motivated numerous researchers to investigate ChatGPT from various aspects. According to Google Scholar, there are more than 500 articles with ChatGPT in their titles or mentioning it in their abstracts. Considering this, a review is urgently needed, and our work fills this gap. Overall, this work is the first to survey ChatGPT with a comprehensive review of its underlying technology, applications, and challenges. Moreover, we present an outlook on how ChatGPT might evolve to realize general-purpose AIGC (a.k.a. AI-generated content), which will be a significant milestone for the development of AGI.
We present a vision and language model named MultiModal-GPT to conduct multi-round dialogue with humans. MultiModal-GPT can follow various instructions from humans, such as generating a detailed caption, counting the objects of interest, and answering general questions from users. MultiModal-GPT is parameter-efficiently fine-tuned from OpenFlamingo, with Low-rank Adapter (LoRA) added both in the cross-attention part and the self-attention part of the language model. We first construct instruction templates with vision and language data for multi-modality instruction tuning to make the model understand and follow human instructions. We find the quality of training data is vital for the dialogue performance, where a small amount of data containing short answers can lead the model to respond tersely to any instruction. To further enhance the ability of MultiModal-GPT to chat with humans, we utilize language-only instruction-following data to train MultiModal-GPT jointly. The joint training of language-only and visual-language instructions with the same instruction template effectively improves dialogue performance. Various demos show the ability of MultiModal-GPT to hold a continuous dialogue with humans. Code, dataset, and demo are at https://github.com/open-mmlab/Multimodal-GPT
The type of bar matters when it comes to how it bends and recoils, but why is still a mystery
A new quantum physics study reveals that simply changing a magnetic field over time can unlock entirely new forms of matter that don’t exist under normal conditions. By carefully “driving” materials with timed magnetic shifts, researchers created exotic quantum states that could be far more stable and resistant to errors, one of the biggest challeng
Cinematographer Hillary Fyfe Spera on how she kept things visually fresh for Born Again’s second season
Astronomers have spotted something surprising in the far outer Solar System: a faint, short-lived atmosphere clinging to a tiny icy world that shouldn’t be able to hold one at all. The object, called 2002 XV93, is far smaller than Pluto, yet observations during a rare stellar alignment revealed its presence through a subtle dimming of starlight. Eve
A mysterious comet from beyond our solar system is giving astronomers a rare glimpse into alien worlds, and it may have formed in a place far colder and stranger than anything around our Sun. The interstellar visitor, called 3I/ATLAS, contains an astonishingly high amount of “heavy water,” far exceeding anything seen in our own solar system
A scorching, airless world just 48 light-years away is offering scientists a rare glimpse into the geology of distant planets. Using the James Webb Space Telescope, researchers studied LHS 3844 b, a tidally locked “super-Earth” with a permanent dayside hot enough to melt metal, and discovered it’s a dark, barren rock with no atmosphere
A bizarre planetary pairing 190 light-years away is challenging everything astronomers thought they knew about how worlds form. A “lonely” hot Jupiter, typically found without nearby companions, is sharing its system with a smaller mini-Neptune tucked even closer to the star, a setup once thought nearly impossible
Hubble has revealed a giant planet-forming disk unlike anything astronomers have seen before. Nicknamed “Dracula’s Chivito,” the enormous structure appears turbulent and oddly lopsided, with towering filaments visible on only one side. The disk contains enough material to potentially create multiple giant planets, making it a fascinating new labora
The Rivian Assistant is available for both Gen1 and Gen2 hardware
A medieval monk may have beaten Edmond Halley to one of astronomy’s greatest discoveries by nearly 700 years. Researchers say Eilmer of Malmesbury recognized that the blazing comet seen in 1066 was the same one he had witnessed in 989. At the time, comets were viewed as terrifying omens tied to war and royal deaths, adding even more drama to the fa
Scientists have taken a major step toward ultra-secure quantum communication by demonstrating a remarkably stable quantum encryption system that worked across more than 120 kilometers of optical fiber. Using tiny semiconductor quantum dots that emit single particles of light on demand, the team achieved one of the highest secure key rates yet for t
Cumberland, B.C., is reimagining its coal mining past as a clean energy opportunity. Water trapped in abandoned mine tunnels could be used in a geothermal system to heat and cool buildings efficiently and with minimal emissions
"NASA also is defining the concept of operations for the mission
A new study suggests AI chatbots may do more than spread misinformation: they can actively strengthen a user’s false beliefs. Because conversational AI often validates and builds on what users say, it can make distorted memories, conspiracy theories, or delusions feel more believable and emotionally real. Researchers warn that AI companions may be
Scientists may have uncovered a surprising secret behind why life exists at all. A new study suggests that the Universe’s fundamental constants, the deep physical rules that govern everything from atoms to stars, appear to sit within an incredibly narrow “sweet spot” that allows liquids to flow properly inside living cells. Even tiny shifts in th
Distinct form of tooth protein in Homo erectus shows up in Denisovans—and us