20 results found
"Computer vision and pattern recognition." , 84(9), pp. 1265–1266 Additional informationNotes on contributorsNanning Zheng Email: nnzheng@mail.xjtu.edu.cn George Loizou Email: george@dcs.bbk.ac.uk Xuguang Lan Email: xglan@aiar.xjtu.edu.cn Xuelong Li Email: xuelong_li@ieee.org
Both pattern recognition and computer vision have experienced rapid progress in the last twenty-five years. This book presents the latest advances in pattern recognition and computer vision along with their many applications. It features articles written by renowned leaders in the field, with topics presented in a readable form accessible to a wide range of readers. The book is divided into five parts: basic methods in pattern recognition, basic methods in computer vision and image processing, recognition applications, life science and human identification, and systems and technology.
This special issue covers a wide range of topics from the areas of Computer Vision, Pattern Recognition, and Machine Learning. This breadth of scope is reflected by the papers included in the issue, which touch on topics including geometric Computer Vision, medical image processing, physical scene understanding, and interpretability of deep neural networks. The special issue consists of extended versions of the best papers originally presented at the 42nd German Conference on Pattern Recognition (DAGM GCPR 2020), held virtually between September 28th and October 1st, 2020.
Remote photoplethysmography (rPPG) is a technique that aims to remotely estimate the heart rate of an individual using an RGB camera. Although several studies use the rPPG methodology, it is usually applied in a laboratory in a controlled environment, where both the camera and the subject are static, and the illumination is ideal for the task. However, applying rPPG in a real-life scenario is much more demanding, since dynamic illumination issues arise. The work presented in this paper introduces a framework to estimate the heart rate of an individual in real-time using an RGB camera in a situation characterized by dynamic illumination. Such situations occur, for example, when either the camera or the subject is moving, and/or the face visibility is limited. The framework uses a face detection program to extract regions of interest on an individual's face. These regions are combined and constitute the input to a convolutional neural network, which is trained to estimate the heart rate in real-time. The method is evaluated on three publicly available datasets, and an in-house dataset specifically collected for the purpose of this study, that includes motions and dynamic illumination. The method shows good performance on all four datasets, outperforming other methods.
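To illustrate the underlying rPPG signal model, here is a minimal classical baseline, not the paper's CNN framework: given a per-frame mean green-channel signal extracted from a facial region of interest, the heart rate can be estimated from the dominant frequency in the physiologically plausible band. The function name and the band limits are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def estimate_heart_rate(green_means, fs):
    """Estimate heart rate (bpm) from a per-frame mean green-channel
    signal via an FFT peak search in the plausible heart-rate band."""
    x = np.asarray(green_means, dtype=float)
    x = (x - x.mean()) * np.hanning(len(x))    # remove DC, taper edges
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    power = np.abs(np.fft.rfft(x)) ** 2
    band = (freqs >= 0.7) & (freqs <= 4.0)     # 42-240 bpm
    peak_hz = freqs[band][np.argmax(power[band])]
    return 60.0 * peak_hz                      # Hz -> beats per minute
```

This baseline assumes a static, well-lit face; the paper's contribution is precisely to handle the dynamic-illumination cases where such a simple spectral estimate breaks down.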
Techniques from sparse signal representation are beginning to see significant impact in computer vision, often on nontraditional applications where the goal is not just to obtain a compact high-fidelity representation of the observed signal, but also to extract semantic information. The choice of dictionary plays a key role in bridging this gap: unconventional dictionaries consisting of, or learned from, the training samples themselves provide the key to obtaining state-of-the-art results and to attaching semantic meaning to sparse signal representations. Understanding the good performance of such unconventional dictionaries in turn demands new algorithmic and analytical techniques. This review paper highlights a few representative examples of how the interaction between sparse signal representation and computer vision can enrich both fields, and raises a number of open questions for further study.
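The dictionaries discussed above can be as simple as the raw training samples stacked as columns; a greedy sparse coder then picks the few atoms that best explain a test signal. The following is a minimal matching-pursuit sketch of that idea (the function name and the fixed sparsity level `k` are illustrative assumptions):

```python
import numpy as np

def matching_pursuit(D, y, k=3):
    """Greedy sparse coding: approximate y as a k-sparse combination of
    dictionary atoms (columns of D), e.g. raw training samples."""
    D = D / np.linalg.norm(D, axis=0)    # unit-norm atoms
    r = y.astype(float).copy()           # residual
    coef = np.zeros(D.shape[1])
    for _ in range(k):
        j = np.argmax(np.abs(D.T @ r))   # best-correlated atom
        c = D[:, j] @ r
        coef[j] += c
        r -= c * D[:, j]
    return coef, r
```

When the atoms are labeled training samples, the identity of the selected atoms carries the semantic information the review refers to.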
The following topics are dealt with: object recognition; video analysis and reconstruction; tracking humans; image analysis and segmentation; structure from motion; applications and real time vision; and video recognition.
No abstract available (see the original article for the full text).
Non-blind deblurring is an integral component of blind approaches for removing image blur due to camera shake. Even though learning-based deblurring methods exist, they have been limited to the generative case and are computationally expensive. To date, manually-defined models are thus most widely used, though limiting the attained restoration quality. We address this gap by proposing a discriminative approach for non-blind deblurring. One key challenge is that the blur kernel in use at test time is not known in advance. To address this, we analyze existing approaches that use half-quadratic regularization. From this analysis, we derive a discriminative model cascade for image deblurring. Our cascade model consists of a Gaussian CRF at each stage, based on the recently introduced regression tree fields. We train our model by loss minimization and use synthetically generated blur kernels to generate training data. Our experiments show that the proposed approach is efficient and yields state-of-the-art restoration quality on images corrupted with synthetic and real blur.
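For context on the manually-defined baselines this abstract refers to, a classical non-blind deconvolution can be written in a few lines: with the blur kernel known, Wiener filtering inverts the blur in the frequency domain. This is a standard textbook baseline, not the paper's regression-tree-field cascade; the `nsr` regularization constant is an illustrative assumption.

```python
import numpy as np

def wiener_deblur(blurred, kernel, nsr=1e-2):
    """Non-blind Wiener deconvolution: recover a sharp image from a
    blurred one given the known blur kernel and a noise-to-signal ratio.
    The FFT formulation assumes circular (periodic) boundary conditions."""
    H = np.fft.fft2(kernel, s=blurred.shape)       # kernel spectrum
    G = np.fft.fft2(blurred)
    # Wiener filter: conj(H) / (|H|^2 + NSR) regularizes frequencies
    # where the kernel response is near zero.
    W = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft2(W * G))
```

The paper's discriminative cascade aims to beat such fixed filters by learning the restoration model per stage while remaining robust to kernels unseen at training time.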
Most recent semantic segmentation methods adopt a fully-convolutional network (FCN) with an encoder-decoder architecture. The encoder progressively reduces the spatial resolution and learns more abstract/semantic visual concepts with larger receptive fields. Since context modeling is critical for segmentation, the latest efforts have been focused on increasing the receptive field, through either dilated/atrous convolutions or inserting attention modules. However, the encoder-decoder based FCN architecture remains unchanged. In this paper, we aim to provide an alternative perspective by treating semantic segmentation as a sequence-to-sequence prediction task. Specifically, we deploy a pure transformer (i.e., without convolution and resolution reduction) to encode an image as a sequence of patches. With the global context modeled in every layer of the transformer, this encoder can be combined with a simple decoder to provide a powerful segmentation model, termed SEgmentation TRansformer (SETR). Extensive experiments show that SETR achieves a new state of the art on ADE20K (50.28% mIoU) and Pascal Context (55.83% mIoU), and competitive results on Cityscapes. In particular, we achieved first place on the highly competitive ADE20K test server leaderboard on the day of submission.
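The "image as a sequence of patches" step that SETR's encoder consumes can be sketched concretely: split the image into fixed-size patches, flatten each, and linearly project to the model dimension. The random projection below is a stand-in for the learned embedding weights, and the function name and default sizes are illustrative assumptions.

```python
import numpy as np

def image_to_patch_sequence(img, patch=16, dim=256, rng=None):
    """Flatten an H x W x C image into the patch-token sequence a
    ViT-style encoder consumes: split into patch x patch tiles, flatten
    each, and linearly project to the model dimension."""
    rng = rng or np.random.default_rng(0)
    H, W, C = img.shape
    assert H % patch == 0 and W % patch == 0
    # (H/P, P, W/P, P, C) -> (num_patches, P*P*C)
    patches = (img.reshape(H // patch, patch, W // patch, patch, C)
                  .transpose(0, 2, 1, 3, 4)
                  .reshape(-1, patch * patch * C))
    proj = rng.normal(scale=0.02, size=(patches.shape[1], dim))  # stand-in for learned weights
    return patches @ proj    # (num_patches, dim) token sequence
```

Because the transformer never downsamples this sequence, every layer models global context at the same resolution, which is the architectural contrast with encoder-decoder FCNs drawn above.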
No abstract available (see the original article for the full text).
No abstract available (see the original article for the full text).
Conference proceedings front matter may contain various advertisements, welcome messages, committee or program information, and other miscellaneous conference information. This may in some cases also include the cover art, table of contents, copyright statements, title-page or half title-pages, blank pages, venue maps or other general information relating to the conference that was part of the original conference proceedings.
No abstract available (see the original article for the full text).
No abstract available (see the original article for the full text).
No abstract available (see the original article for the full text).
No abstract available (see the original article for the full text).