CVPR 2017 Tutorial: Dealing with Reality: Low-Quality Visual Data Processing and Analytics

Time

07/21/2017 afternoon

Location

Hawaii Convention Center, Honolulu, Hawaii (in conjunction with IEEE CVPR 2017).

Speakers

Description

While many sophisticated models have been developed for visual information processing, few pay attention to their usability in the presence of (heavy) data quality degradation. Most successful models are trained and evaluated on high-quality visual datasets, yet in practical scenarios the data source often cannot be assured of high quality. For example, video surveillance systems must rely on cameras of very limited resolution, since installing high-definition cameras everywhere is prohibitively expensive; this creates the practical need to recognize objects reliably from very low-resolution images. Other quality factors, such as occlusion, motion blur, missing data and label ambiguity, are also ubiquitous in the wild.

The tutorial will present a comprehensive and in-depth review of recent advances in the robust sensing, processing and understanding of low-quality visual data. First, we will introduce how image/video restoration models (e.g., denoising, deblurring, super-resolution) can be enhanced by incorporating various problem structures and priors. Next, we will show how image/video restoration and visual recognition can be jointly optimized; such end-to-end optimization consistently outperforms traditional multi-stage pipelines. We will also demonstrate how the approaches discussed above benefit real-world applications, such as face recognition, video surveillance, and license plate recognition. Furthermore, we will address an increasingly important issue in using big visual data for machine learning: the available dataset often does not contain high-quality labels, so weak, noisy or otherwise low-quality labels need to be exploited intelligently to achieve the desired outcome.
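To give a flavor of the end-to-end idea above, here is a minimal, hypothetical NumPy sketch (not taken from the tutorial materials; all data and dimensions are synthetic and invented for illustration): a linear "restoration" stage mapping low-resolution features to high-resolution ones is trained jointly with a softmax classifier, so the restoration is tuned for recognition rather than for pixel fidelity alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: high-res signals x_hr, their degraded low-res
# observations x_lr, and labels that depend on the high-res content.
n, d_hr, d_lr, n_cls = 200, 8, 4, 2
x_hr = rng.normal(size=(n, d_hr))
A = rng.normal(size=(d_hr, d_lr)) / d_hr            # fixed degradation operator
x_lr = x_hr @ A + 0.1 * rng.normal(size=(n, d_lr))
y = (x_hr.sum(axis=1) > 0).astype(int)

# Parameters: restoration W_r (low-res -> high-res) and classifier W_c.
W_r = rng.normal(size=(d_lr, d_hr)) * 0.1
W_c = rng.normal(size=(d_hr, n_cls)) * 0.1

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

mse0 = np.mean((x_lr @ W_r - x_hr) ** 2)            # reconstruction error before training

lr, lam = 0.1, 0.5                                  # lam weights the restoration loss
for step in range(300):
    x_rec = x_lr @ W_r                              # "restored" signal
    p = softmax(x_rec @ W_c)                        # class probabilities
    onehot = np.eye(n_cls)[y]
    # Joint loss: cross-entropy + lam * reconstruction MSE,
    # minimized end-to-end so both stages share gradients.
    grad_logits = (p - onehot) / n
    grad_rec = grad_logits @ W_c.T + lam * 2 * (x_rec - x_hr) / n
    W_c -= lr * x_rec.T @ grad_logits
    W_r -= lr * x_lr.T @ grad_rec

mse = np.mean((x_lr @ W_r - x_hr) ** 2)
acc = ((x_lr @ W_r @ W_c).argmax(axis=1) == y).mean()
print(f"reconstruction MSE: {mse0:.3f} -> {mse:.3f}, training accuracy: {acc:.2f}")
```

In a two-stage pipeline the restoration stage would be trained on the reconstruction loss alone; here the classification gradient also flows into `W_r`, which is the essence of the joint optimization the talk covers (with deep networks in place of these linear maps).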

As low data quality is a bottleneck for numerous applications, such as visual recognition, object tracking, medical image processing and 3D vision, our proposed tutorial is expected to be of broad interest to the CVPR community.
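The noisy-label setting raised above can also be illustrated with a small sketch. The following hypothetical NumPy example (not from the tutorial materials; data and parameters are synthetic) shows one common strategy, soft label bootstrapping: the training target blends the given noisy label with the model's own prediction, so confidently wrong annotations carry less weight.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 2-class data: two well-separated Gaussian clusters.
n = 400
x = np.vstack([rng.normal(-2, 1, size=(n // 2, 2)),
               rng.normal(+2, 1, size=(n // 2, 2))])
y_true = np.repeat([0, 1], n // 2)

# Corrupt 30% of the labels to simulate noisy annotation.
flip = rng.random(n) < 0.3
y_noisy = np.where(flip, 1 - y_true, y_true)

W = np.zeros((2, 2))
b = np.zeros(2)
beta = 0.8  # weight on the (noisy) given label vs. the model's own prediction
for _ in range(500):
    logits = x @ W + b
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    # Soft-bootstrapping target: blend of the noisy one-hot label and the
    # current prediction (treated as a constant, i.e. no gradient through it).
    target = beta * np.eye(2)[y_noisy] + (1 - beta) * p
    g = (p - target) / n
    W -= 0.5 * x.T @ g
    b -= 0.5 * g.sum(axis=0)

acc_clean = ((x @ W + b).argmax(axis=1) == y_true).mean()
print(f"accuracy vs. clean labels: {acc_clean:.2f}")
```

Despite training only on corrupted labels, the learned boundary is evaluated against the clean ones; techniques of this family, along with transfer from web data, are the subject of Talk III.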

Syllabus

Talk I: Image and Video Restorations with Structural Priors

Speakers

Abstract

TBD

Slides

TBD

Related Papers

1. H. Zhang, J. Yang, Y. Zhang, N. M. Nasrabadi, T. S. Huang, “Close the loop: Joint blind image restoration and recognition with sparse representation prior”, In Proceedings of International Conference on Computer Vision (ICCV) 2011.
2. H. Zhang, D. Wipf, “Non-Uniform Camera Shake Removal Using a Spatially-Adaptive Sparse Penalty”, Advances in Neural Information Processing Systems (NIPS) 2013.
3. H. Zhang, D. Wipf, Y. Zhang, “Multi-Observation Blind Deconvolution with an Adaptive Sparse Prior”, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 36(8): 1628-1643 (2014).
4. H. Zhang, L. Carin, “Multi-shot Imaging: Joint Alignment, Deblurring, and Resolution Enhancement”, In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
5. D. Wipf, H. Zhang, “Revisiting Bayesian blind deconvolution”, Journal of Machine Learning Research (JMLR) 15(1): 3595-3634 (2014).

Talk II: Recognition from Very Low-Quality Images and Videos using Deep Networks

Speakers

Abstract

TBD

Slides

TBD

Related Papers

1. Z. Wang, S. Chang, Y. Yang, D. Liu and T. Huang, "Studying Very Low Resolution Recognition Using Deep Networks", In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
2. Z. Wang, Y. Yang, Z. Wang, S. Chang, J. Yang, and T. Huang, “Learning Super-Resolution Jointly from External and Internal Examples”, IEEE Transactions on Image Processing (TIP), vol. 24, no. 11, pp. 4359-4371, Nov. 2015.
3. Z. Wang, Z. Wang, S. Chang, J. Yang and T. Huang, “A Joint Perspective Towards Image Super-Resolution: Unifying External- and Self-Examples”, In Proceedings of IEEE Winter conference on Applications of Computer Vision (WACV), 2014.
4. Z. Wang, H. Li, Q. Ling, and W. Li, “Robust Temporal-Spatial Decomposition and Its Applications in Video Processing”, IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 23, no. 3, pp. 387-400, Mar. 2013.

Talk III: Learning with Noisy Visual Big Data: Model and Applications

Speakers

Abstract

TBD

Slides

TBD

Related Papers

1. Quanzeng You, Jiebo Luo, Hailin Jin, and Jianchao Yang, "Robust Image Sentiment Analysis using Progressively Trained and Domain Transferred Deep Networks," The Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI), Austin, TX, January 25-30, 2015.
2. Quanzeng You, Hailin Jin, Jianchao Yang, Jiebo Luo, "Building a Large-Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark," The 30th AAAI Conference on Artificial Intelligence (AAAI), Phoenix, AZ, January 2016.
3. Yuncheng Li, Yale Song, Liangliang Cao, Joel Tetreault, Larry Goldberg, Jiebo Luo, “TGIF: A New Dataset and Benchmark on Animated GIF Description,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, June 2016. (Spotlight)
4. Jingen Liu, Jiebo Luo, Mubarak Shah, "Recognizing Realistic Actions from Videos in the Wild," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, FL, June 2009. (Oral Presentation)
5. Lixin Duan, Dong Xu, Ivor Tsang, Jiebo Luo, “Visual Event Recognition in Videos by Learning from Web Data,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, June 2010. (Best Student Paper)
6. Yiming Liu, Dong Xu, Ivor Tsang, Jiebo Luo, "Textual query of personal photos facilitated by large-scale web data," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 33(5): 1022-1036, May 2011.