Search - ResearchTracker

Recent progress in VLMs has demonstrated impressive capabilities across a variety of tasks in the natural image domain. Motivated by these advancements, the remote sensing community has begun to adopt VLMs for remote sensing vision-language tasks, including scene understanding, image captioning, and visual question answering. However, existing remote sensing VLMs typically rely on closed-set scene understanding and focus on generic scene descriptions, yet lack the ability to incorporate external knowledge. This limitation hinders their capacity for semantic reasoning over complex or context-dependent queries that involve domain-specific or world knowledge. To address these challenges, we first introduced a multimodal Remote Sensing World Knowledge (RSWK) dataset, which comprises high-resolution satellite imagery and detailed textual descriptions for 14,141 well-known landmarks from 175 countries, integrating both remote sensing domain knowledge and broader world knowledge. Building upon this dataset, we proposed a novel Remote Sensing Retrieval-Augmented Generation (RS-RAG) framework, which consists of two key components. The Multi-Modal Knowledge Vector Database Construction modul

Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models

arXiv, 2024-01-17. Authors: Haonan Guo, Xin Su, Chen Wu

Recently, large language models (LLMs), especially ChatGPT, have shown exceptional performance in language understanding, reasoning, and interaction, attracting users and researchers from multiple fields and domains. Although LLMs have shown great capacity for human-like task completion on natural language and natural images, their potential for handling remote sensing interpretation tasks has not yet been fully explored. Moreover, the lack of automation in remote sensing task planning limits the accessibility of remote sensing interpretation techniques, especially for non-experts from other research fields. To this end, we present Remote Sensing ChatGPT, an LLM-powered agent that uses ChatGPT to connect various AI-based remote sensing models to solve complicated interpretation tasks. More specifically, given a user request and a remote sensing image, we use ChatGPT to understand the user request, perform task planning according to the tasks' functions, execute each subtask iteratively, and generate the final response from the output of each subtask. Considering that LLMs are trained with natural language and are not capable of di

Search results: Remote

Remote Sensing Retrieval-Augmented Generation: Bridging Remote Sensing Imagery and Comprehensive Knowledge with a Multi-Modal Dataset and Retrieval-Augmented Generation Model

Remote Sensing ChatGPT: Solving Remote Sensing Tasks with ChatGPT and Visual Models

Survey on Disaster Management Datasets for Remote Sensing Based Emergency Applications

STAR-IOD: Scale-decoupled Topology Alignment with Pseudo-label Refinement for Remote Sensing Incremental Object Detection

Vision-Language Modeling Meets Remote Sensing: Models, Datasets and Perspectives

SAM2-ELNet: Label Enhancement and Automatic Annotation for Remote Sensing Segmentation

Visual and Text Prompt Segmentation: A Novel Multi-Model Framework for Remote Sensing

Design and Development of a Remotely Wire-Driven Walking Robot

Efficient and Robust Remote Sensing Image Denoising Using Randomized Approximation of Geodesics' Gramian on the Manifold Underlying the Patch Space

TimeSenCLIP: A Time Series Vision-Language Model for Remote Sensing

Towards Remote Sensing Change Detection with Neural Memory

Real-Time Oriented Object Detection Transformer in Remote Sensing Images

ImageRAG: Enhancing Ultra High Resolution Remote Sensing Imagery Analysis with ImageRAG

TinyRS-R1: Compact Multimodal Language Model for Remote Sensing

Advancing Image Super-resolution Techniques in Remote Sensing: A Comprehensive Survey

Remote Implementation of Hidden or Partially Unknown Quantum Operators using Optimal Resources: A Generalized View

Unsupervised Stereo Matching Network For VHR Remote Sensing Images Based On Error Prediction

A Novel Scene Coupling Semantic Mask Network for Remote Sensing Image Segmentation

Brain-Inspired Online Adaptation for Remote Sensing with Spiking Neural Network

Composed Image Retrieval for Remote Sensing