Distributed Systems for Data Management - 5MMSDTD7
Number of hours
Lectures : -
Tutorials : -
Laboratory works : -
Projects : 18.0
Internship : -
Written tests : -
ECTS : 2.0
Officials: Thomas ROPARS
The goal of this project is to design and automatically deploy a distributed data processing application. The application will be built on the main frameworks used in the Big Data community and deployed automatically on a public cloud infrastructure.
Students will work in teams of five.
Each team will build a distributed data processing system. Such systems are now widely used in many domains (stock market analysis, sensor data analysis, analysis of data from tracking systems, etc.). Students are free to choose the domain their application targets.
A data processing system includes several components, each of them being distributed over several machines:
A data ingestion component
A data storage component
One or several data processing components
A visualization component
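The four components listed above can be sketched as a single-process pipeline. This is only an illustrative sketch, not part of the course material: the names Reading, IngestionQueue, Store, process, render and run_pipeline are hypothetical, and the in-memory classes merely stand in for the distributed ingestion, storage, processing and visualization services a real project would use.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Reading:
    """One hypothetical sensor measurement."""
    sensor_id: str
    value: float


class IngestionQueue:
    """In-memory stand-in for a message broker: publish, then drain in FIFO order."""

    def __init__(self):
        self._buffer = deque()

    def publish(self, reading):
        self._buffer.append(reading)

    def drain(self):
        while self._buffer:
            yield self._buffer.popleft()


class Store:
    """In-memory stand-in for a database: values grouped by sensor id."""

    def __init__(self):
        self._rows = defaultdict(list)

    def insert(self, reading):
        self._rows[reading.sensor_id].append(reading.value)

    def scan(self):
        return dict(self._rows)


def process(rows):
    """Processing stage: compute the average value per sensor."""
    return {sensor: mean(values) for sensor, values in rows.items()}


def render(averages):
    """Visualization stage: one text line per sensor, sorted by sensor id."""
    return [f"{sensor}: {avg:.1f}" for sensor, avg in sorted(averages.items())]


def run_pipeline(readings):
    """Chain the stages: ingest, store, process, visualize."""
    queue, store = IngestionQueue(), Store()
    for reading in readings:
        queue.publish(reading)
    for reading in queue.drain():
        store.insert(reading)
    return render(process(store.scan()))


sample = [Reading("s1", 20.0), Reading("s2", 5.0), Reading("s1", 22.0)]
report = run_pipeline(sample)
print("\n".join(report))
```

In an actual deployment each stage would run as its own distributed service, so the direct function calls here would become network communication between machines.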
For this project, the students will use the standard technologies that are used by the main companies in the domain (Google, Facebook, LinkedIn, etc.). For example, the students could use:
Kafka or Samza for data ingestion
Spark or Flink for data processing
Cassandra, MongoDB or InfluxDB for storing data
Furthermore, the students will have to set up the software infrastructure needed to configure, deploy and automatically reconfigure their application so that it can run on a cloud computing platform (e.g. AWS, Azure). The tools used for this stage could include:
Resource provisioning and configuration tools (e.g. Ansible)
Software configuration and deployment tools (e.g. Docker)
Orchestration tools (e.g. Kubernetes)
Prerequisites: networks, distributed systems, databases.
Evaluation: demo of the running application, report and documentation.
N1 = P (no resit)
The course exists in the following branches:
Curriculum - Information Systems Engineering - Semester 9