My Research Programme
Many real-world vision tasks such as motion segmentation, large-scale scene reconstruction, and action/scene classification remain challenging research problems. This is not surprising, considering that more than half of the neocortex is involved in visual processing. To solve these vision problems robustly, we envisage that a concerted research effort is required, both in integrating the different vision modules and in further advancing our fundamental understanding of the individual modules.
I am interested in investigating the feedback and lateral links that exist in complex vision problems. For instance, the segmentation problem will need to combine a broad range of technical advances in the computation of 3D surfaces, knowledge about natural scene statistics and gestalt laws, and expertise in advanced mathematical techniques such as level sets, graph cuts, and MRF-based learning. This provides an opportunity for psychophysical, computational and mathematical studies that will extend our understanding of the processes underlying the interactions between various vision modules, as well as the incorporation of environmental constraints to enhance performance in dynamic, real-world situations.
In the past decade, my research has revolved around the central question of space perception arising from motion cues, otherwise known as the structure from motion (SFM) problem. I approach this problem mainly from the computational perspective, but I also study its psychophysical implications and some applied aspects (see the links below). It remains an active area of my research. More recently, I have been addressing various motion-related problems such as 3D motion segmentation, tracking, and change detection in scenes with complicated dynamic behavior such as swaying trees and undulating waves.
For prospective research students, I am looking for someone who is really keen to understand the visual processes involved in human vision, and who feels excited to build a robust system that can function in the real world. Preferably, the student should have an adequate level of mathematical sophistication, as the field of computer vision is currently going through a crucial transformation, requiring more and more mathematical skills such as PDEs, differential geometry, functional analysis, etc. Interested students with an EE, CS, or applied mathematics background are welcome to contact me.
For research staff, I am now looking for several research fellows in the following two areas: 1) Visibility Enhancement and Robust Motion Analysis in Extreme Outdoor Conditions, including rain, dust, haze, and night-time conditions. 2) Dynamic Vision for Actions, involving topics such as multiple target tracking, motion segmentation, semantic labeling of 3D scenes for actions such as driving, as well as the more abstract problems of labeling, assignment, clustering, and model selection. The positions are funded for 3 years and are available from April 2016 onwards. Salary is competitive. Please send a detailed CV with publication list, a concise description of research interests and future plans, and academic transcripts to my email account.
Simultaneous Clustering and Model Selection for Tensor Affinities
Zhuwen Li, Yang Shuoguang, Loong-Fah Cheong, Kim Chuan Toh.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 (Spotlight presentation, 9.7% acceptance rate)
[PDF][Supplementary PDF][Code download]
- Given an affinity matrix or tensor, perform model selection and clustering simultaneously. Model selection is seldom addressed in the field.
- Translate the concept of rank and the notion of positive semi-definiteness to the higher-dimensional tensor setting in a form that is tractable.
- Avoid constructing the full affinity tensor; instead, solve the problem efficiently via stochastic optimization in an online fashion.
Actionness-assisted Recognition of Actions
Ye Luo, Loong-Fah Cheong, and An Tran.
IEEE International Conference on Computer Vision (ICCV), 2015
[PDF (1.5MB)][Project page and code download]
- Extract low-level actionness attributes from videos that can reveal agency and intentionality of action, independent of the action type.
- Can be used for action detection, and for implementing an actionness-driven pooling scheme to improve action recognition performance.
- Performs well on actions involving interaction, because the method groups the interacting humans or objects into a unit via temporal synchrony.
Simultaneous Video Defogging and Stereo Reconstruction
Zhuwen Li, Ping Tan, Robby T. Tan, Danping Zou, Steven Zhiying Zhou and Loong-Fah Cheong.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015 (Oral)
[PDF (9MB)][Presentation slides][Matlab Code] [Dataset][Results]
- Jointly estimate scene depth and recover the clear latent image from a foggy video sequence.
Practical Matrix Completion and Corruption Recovery using Proximal Alternating Robust Subspace Minimization
Yu-Xiang Wang, Choon Meng Lee, Loong-Fah Cheong, Kim-Chuan Toh.
International Journal of Computer Vision, 2014
[PDF (6MB)][Matlab Code]
- Can handle a high percentage of missing data, non-random support, noise, and gross corruptions.
Block-sparse RPCA for Salient Motion Detection
Zhi Gao, Loong-Fah Cheong, and Yu-Xiang Wang.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014
[PDF (1.6MB)][Matlab Code][Video Demo 1 (69MB)][Video Demo 2 (21MB)]
- State-of-the-art results on two recent change detection benchmarks.
- Can handle illumination change, bad weather, background motion, camera jitter, and disparate scales.
Perspective Motion Segmentation via Collaborative Clustering
Zhuwen Li, Jiaming Guo, Loong-Fah Cheong, and Steven Zhiying Zhou.
IEEE International Conference on Computer Vision (ICCV), 2013 (Oral)
[PDF (2MB)][62-clip Dataset Download (84 MB)]
- Can handle perspective effects, missing data, and model selection while retaining an elegant formulation.
- State-of-the-art results in handling the preceding challenges.
Consistent Foreground Co-segmentation
Jiaming Guo, Loong-Fah Cheong, Robby T. Tan and Steven Zhiying Zhou.
Asian Conference on Computer Vision (ACCV), 2014
[PDF (9MB)][Project page and CFViCS database download]
- Can handle foreground with variegated appearance, moving nonrigidly or consisting of multiple interacting entities (e.g. a mating pair of birds).
- Can handle cluttered background and remove extraneous objects momentarily moving together.
Quasi-Parallax for Nearly Parallel Frontal Eyes: A Possible Role of Binocular Overlap during Rapid Locomotion
Loong-Fah Cheong, Zhi Gao.
International Journal of Computer Vision, 2013
- Examine whether binocular overlap is necessarily used for stereoscopic depth recovery.
Geometry of Distorted Visual Space. Int'l Journal of Computer Vision, 32(3), pp. 195-212, 1999. © 1999 by Kluwer Academic
Error in Depth Reconstruction. Int'l Journal of Computer Vision, 44(3), pp. 199-217, Aug 2001. © 2001 by Kluwer Academic
Behaviour of SFM algorithms. Int'l Journal of Computer Vision, 51(2), pp. 111-137, 2003. © 2003 by Kluwer Academic
Depth distortion under calibration uncertainty, Computer Vision and Image Understanding, Volume 93, Issue 3, March 2004, Pages 221-244.
How do we Perceive Depths from Motion Cues in the Movies: A Computational Account, Journal of the Optical Society of America A: Optics, Image Science, and Vision, 2008.
Linear Quasi-Parallax SfM using Laterally-placed Eyes, International Journal of Computer Vision, vol. 84, no. 1, pp. 21-39, August 2009.
When Discrete Meets Differential: Assessing the Stability of Structure from Small Motion, International Journal of Computer Vision, vol. 86, no. 1, pp. 87-110, 2010.
Error Characteristics of SFM with Erroneous Focal Length, Computer Vision and Image Understanding, 115, No.1, (Jan 2011) 16?0.
Smoothly Varying Affine Stitching, CVPR 2011, Jun 20-25, Oral presentation
Active Visual Segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), Vol. 34, No. 2, pp. 639-653, April 2012.
Simultaneous Camera Pose and Correspondence Estimation with Motion Coherence, International Journal of Computer Vision, 96(2), pp. 145-161, 2012.
Quasi-Parallax for Nearly Parallel Frontal Eyes: A Possible Role of Binocular Overlap during Rapid Locomotion, accepted for publication in International Journal of Computer Vision, June 2012.
Perspective Motion Segmentation via Collaborative Clustering, ICCV 2013, Oral presentation.
Actionness-assisted Recognition of Actions, ICCV 2015.
Slant and Tilt Perception: A computational and psychophysical study. Lecture Notes in Computer Science, Vol. 1843, 2000. © 2000 by Springer-Verlag
Absolute distance perception during in-depth head movement: Calibrating optic flow with extra-retinal information. Vision Research, 42(16), pp. 1991-2003, 2002.
The visual perception of plane tilt from motion in small field and large field: psychophysics and theory, Vision Research, Volume 46, Issue 20, October 2006, Pages 3494-3513.
Establishment Shot Detection Using Qualitative Motion. IEEE Conference on Computer Vision and Pattern Recognition, June 18 - 20, 2003, Madison, Wisconsin, Volume II, p. 85-90.
Framework for synthesizing semantic-level indexes. Multimedia Tools and Applications 20(2): 135-158; Jun 2003.
Synergizing Spatial and Temporal Texture. IEEE Transactions on Image Processing 11(10), pp. 1179-1191, 2002.
Addressing the problems of Bayesian Network classification, IEEE Transactions on Knowledge and Data Engineering, Volume 16, Issue 2, February 2004, Pages 230-244.
Affective Understanding in Film, IEEE Transactions on Circuits and Systems for Video Technology, Volume 16, Issue 6, June 2006, Pages 689-704.
A Taxonomy of Directing Semantics for Film Shot Classification, IEEE Transactions on Circuits and Systems for Video Technology, Volume 19, No. 10, October 2009, pp. 1529-1542.
Also leading a seminar class, EE6733 Advanced Topics on Vision and Machine Learning.