Number of hours
- Lectures 18.0
Media in all its forms (such as photos, videos, and music) has been a core component of our lives for several decades. Its impact has increased with the rapid growth of websites, and of social networking sites in particular. For example, over 300 hours of video are uploaded to YouTube every minute. This deluge of data is expected to increase further, with some estimates predicting that over 80% of internet traffic will be due to visual content in 2020. In this context, dealing with problems in the multimedia domain is more important than ever. This course provides an introduction to techniques for handling such multimedia databases and the problems that arise when accessing them.
The course is taught by two researchers working in computer vision and machine learning: Karteek Alahari (Inria Grenoble) and Diane Larlus (Xerox Research). The two instructors will draw on their expertise in academic and industrial research to discuss the material along with real-world applications, and will present the latest developments in topics related to image and video retrieval, large-scale object recognition, action recognition, and big-data problems in computer vision.
Contact: Karteek ALAHARI
In particular, the course will present the following topics:
- Representation and storage of visual and audio data
- Local and global descriptors
- Indexing of visual and audio data
- Similarity measures for comparing data
- Retrieval systems, Data mining
- Support vector machines
- Deep learning (Convolutional Neural Networks)
- Recognition and classification problems in images and video
Basic knowledge of statistics, computer science (data structures and algorithms), and image processing.
Weekly quiz on papers presented by students in the class.
Final written exam.
Y. LeCun, Cours au Collège de France, http://www.college-de-france.fr/site/yann-lecun/course-2015-2016.htm
H. Wang, A. Kläser, C. Schmid, C.-L. Liu, Action Recognition by Dense Trajectories, CVPR 2011
H. Jégou, M. Douze, C. Schmid, Hamming embedding and weak geometric consistency for large scale image search, ECCV 2008
G. Csurka, C. Dance, L. Fan, J. Willamowski, C. Bray, Visual categorization with bags of keypoints, ECCV Workshop 2004
J. Sivic, A. Zisserman, Video Google: A Text Retrieval Approach to Object Matching in Videos, ICCV 2003
D. Lowe, Distinctive image features from scale-invariant keypoints, IJCV 2004