The era of Big Data, in which petabytes of information accumulate at an accelerating rate, calls for the large-scale use of techniques to manage (store, index, shard, replicate), query, and analyse that data. Processing billions of web pages, photos, and log entries calls for the development of new tools and new programming paradigms.
Cloud computing is emerging as a relatively new approach for providing and facilitating on-demand access to computing and storage resources for building applications. The basic principle of cloud computing is that applications, accessible through a network, are built on a service-oriented infrastructure dedicated to providing them with exactly the computing, storage, and network resources they need (no more, no less). Instead of relying on a one-size-fits-all computer or server, the computing context is configured according to the characteristics of the application. Instead of buying a computer or server outright, resources are provided (and paid for) on demand.
This course focuses on data integration and management on cloud service-oriented architectures. It briefly introduces the fundamental concepts of cloud computing, then addresses data and services management in the cloud through practical examples based on existing cloud environments and execution models such as (i) ETL and federation tools; (ii) MapReduce and its implementation Hadoop, one of the most prominent open-source ecosystems of tools for working with large-scale datasets; and (iii) “NoSQL databases”.
The course is built on the idea of getting students to understand the various aspects of data and services management through a problem-solution approach. We will therefore propose problems that might be encountered in the development and deployment of data-centric applications (with large collections of data and services) within a cloud, and we will guide students in proposing and programming solutions using specific tools.
Courses "Principes des SGBD", "Clefs pour l'administration des SGBD relationnels et Objet" and "Distributed Databases"
Exam or presentation and/or practical work (personal project)
Bibliography / textbooks:
[ 1 ] http://deoracle.org/online-pedagogy/teaching-strategies/applying-cloud-computing.html
[ 2 ] http://blogs.msdn.com/b/brunoterkaly/archive/2010/10/05/how-to-teach-cloud-computing-the-windows-azure-platform-step-1.aspx
[ 3 ] http://sites.google.com/site/freeonlineteachingtools/cloud
[ 4 ] http://aws.amazon.com/education/
[ 5 ] http://www.umiacs.umd.edu/~jimmylin/cloud-2008-Fall/index.html
[ 6 ] www.windowsazure.com/fr
[ 7 ] NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence by Pramod J. Sadalage, Martin Fowler
[ 8 ] Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement by Eric Redmond, Jim R. Wilson
[ 9 ] Thanks to J. Ullmann, P. Valduriez, Cl. Roncancio, Ch. Bobineau, JL. Zechinelli, R. Lozano for the slides provided
Various articles and course notes