Ensimag Rubrique Formation 2022

Software infrastructure for data centers & Cloud computing - WMM9MO54

  • Number of hours

    • Lectures 18.0
    • Projects -
    • Tutorials -
    • Internship -
    • Laboratory works -
    • Written tests -

  • ECTS 3.0

Goal(s)

The objectives of the course are to:

  • understand the main challenges involved when using and operating Cloud infrastructures,
  • know the internal principles of Cloud infrastructure services,
  • be able to design Cloud-based applications.

Responsible(s)

Renaud LACHAIZE, Thomas ROPARS

Content(s)

Modern Web applications, popular online services (e.g., search engines, social networks, streaming services) and Big Data applications share some major requirements: they need a large amount of computing resources and have stringent constraints in terms of reliability, availability and performance. To fulfill such requirements, these systems are implemented using a large number of servers hosted in a data center, forming so-called “rack-scale” or even “warehouse-scale” platforms.

At the core of the success of companies like Google, Facebook, Twitter or Amazon is the ability to exploit data center resources efficiently and reliably through well-designed software infrastructures. While a few challenges are specific to the massive scale of these companies, most design principles and research and development work on such software infrastructures are also relevant to smaller-scale systems.

This course aims at studying the design of software infrastructures for data center systems. It introduces some of the main building blocks and abstraction levels of such infrastructures. The following topics will be covered:

  • An overview of the Cloud computing landscape including (i) the basic facilities to deploy applications (virtual machines, containers, functions) and data (block storage, object storage, file storage, database storage), and (ii) the characteristics of “Cloud-native” applications
  • Resource management services for the resource allocation, placement, scheduling, supervision and orchestration of distributed applications (for example, the Kubernetes system)
  • Coordination and communication services: coordination services (for example, etcd and ZooKeeper) for building consistent and highly available applications despite failures and churn, and communication services (for example, Kafka) for interconnecting and integrating applications that act as producers and consumers of data streams
  • Data processing and storage services, including in-memory data stores used to increase application throughput (for example, memcached)
  • The impact of new hardware trends: an overview of recent progress in the hardware design of computer systems (e.g., speed improvements, hardware/software evolutions, and specialized processing units) and the consequences for Cloud infrastructures and applications.

Through this course, students will learn about the design of these services and frameworks, and get the chance to understand the underlying theoretical and practical challenges related to operating systems and distributed systems (including scalability, fault tolerance, data consistency and resource virtualization).

The course is organized into several types of activities: lectures and case studies, lab sessions (mini-projects), and the study and presentation of influential or recent research papers.

Prerequisites

Basic knowledge (M1 level) of operating systems and networks

Test

The evaluation will be based on mini-projects and/or presentations of research papers, and on a written exam.

N1 = (0.66 * E1 + 0.34 * CC)
N2 = (0.66 * E2 + 0.34 * CC)

The continuous assessment grade (CC) is based on mini-projects and/or presentations of research papers.
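
As a rough illustration of the weighting in the formulas above, the following minimal sketch computes a final grade from an exam mark and the continuous assessment mark. The function name, the sample marks and the 0-20 scale are assumptions made for the example; the page itself does not define E1 and E2, which are presumably the written exam marks of the first and second sessions.

```python
def final_grade(exam: float, cc: float) -> float:
    """Weighted grade from the page's formula: 0.66 * exam + 0.34 * CC."""
    return 0.66 * exam + 0.34 * cc


# Hypothetical marks, assuming a 0-20 scale (not stated on the page).
n1 = final_grade(exam=12.0, cc=15.0)
print(round(n1, 2))  # 13.02
```

With these sample marks, the exam contributes 7.92 points and the continuous assessment 5.10 points, so a strong CC mark raises the final grade by about one point over the exam mark alone.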

The exam is given in English only.

Calendar

The course exists in the following branches:

  • Curriculum - Master 2 in Computer Science - Semester 9 (this course is given in English only)

Additional Information

Course ID: WMM9MO54
Course language(s): FR


Bibliography

  • Kris Nova and Justin Garrison. Cloud Native Infrastructure. O’Reilly, 2017.
  • Brendan Burns. Designing Distributed Systems. O’Reilly, 2018.
  • Martin Kleppmann. Designing Data-Intensive Applications. O’Reilly, 2016.
  • Luiz André Barroso, Urs Hölzle, and Parthasarathy Ranganathan. The Datacenter as a Computer: Designing Warehouse-Scale Machines (3rd edition). Morgan & Claypool, 2018.