DS Track

List of course units (all worth 2.5 ECTS)

[DS] Advanced Databases I: Optimization (M1)
Coordinator : Nicole Bidoit
The aim of this course is to reveal the internal mechanisms of a DBMS (physical representation, query evaluation). The emphasis is on the DBMS query engine and on application optimization. This knowledge is needed by application developers and database administrators alike, for example to deploy and maintain applications. The concepts are put into practice using DBMS features (PostgreSQL or Oracle) that make it possible to inspect and tune various system parameters. The course content is as follows: 1. Introduction 2. Memory management and storage, hash indexes, B-tree indexes 3. Implementing SQL with relational operators 4. Execution plans
Prerequisite : Basic knowledge of relational databases (schema design, SQL programming, and the foundations of SQL) is required. Familiarity with embedding SQL in a programming language such as C is desirable but not required.
Language : French
[DS] Advanced Databases II: Transactions (M1)
Coordinator : Nicole Bidoit
The aim of this course is to reveal the internal mechanisms of a DBMS that ensure data reliability and quality. The emphasis is on transaction management for concurrent access and for crash recovery; the course also addresses transaction tuning for applications. This knowledge is needed by application developers and database administrators alike, for example to deploy and maintain applications. The concepts are put into practice using DBMS features (PostgreSQL or Oracle) that make it possible to inspect and tune various system parameters. The course content is as follows: 1. Transactions 2. Concurrency control mechanisms and protocols 3. Logging and crash recovery 4. Application maintenance and transaction tuning
Prerequisite : Basic knowledge of relational databases (schema design, SQL programming, and the foundations of SQL) is required. Familiarity with embedding SQL in a programming language such as C is desirable but not required.
Language : French
[DS] Artificial Intelligence, Logic and Constraints I (M1)
Coordinator : Philippe Chatalic
This course is an introduction to the principles of formalizing and solving problems using methods for the satisfaction (and/or optimization) of logical constraints, which have a very wide range of applications. It describes the fundamental principles and generic methods found at the core of solvers for such problems, so that students can become familiar with their use. The course introduces the concept of a constraint satisfaction problem and gives some principles that can guide the formalization of such problems, illustrated on concrete cases. The emphasis is on problems with finite domains. It presents several notions of local consistency (node consistency, arc consistency, ...) along with algorithms for enforcing them, so that such problems can be automatically transformed into simpler ones. It also presents exhaustive search methods for finding or enumerating the solutions of such problems. Integrating local consistency methods into exhaustive search leads to hybrid search methods whose principles lie at the core of most current solvers.
Prerequisite :
Language : French
[DS] Artificial Intelligence, Logic and Constraints II (M1)
Coordinator : Philippe Chatalic
This course provides hands-on familiarity with generic constraint solvers. Building on one particular solver, whose language is presented in detail, it illustrates how to use it to model and solve problems of increasing complexity. The goal is to gain enough autonomy to tackle a more complex project. The course illustrates in practice the benefits of using generic solvers for constraint satisfaction/optimization problems. It presents the features (language, control tools) of a particular constraint solver and illustrates them on various concrete problems. The course is also an opportunity to cover some complementary notions related to solving such problems, such as tuning heuristics, using global constraints, and modeling preferences.
Prerequisite : Artificial Intelligence, Logic and Constraints I
Language : French
[DS] Distributed Systems for Massive Data Management (M1)
Coordinators : Benoît Groz, Silviu Maniu
This course gives an overview of distributed data management systems and the concepts they implement. It covers the following topics: partitioning, indexing, replication, an overview of NoSQL systems, some elements of architecture and systems, and the data structures and algorithms used in these systems. These notions are covered through numerous NoSQL systems, comparing them with similar relational technologies.
Prerequisite : Advanced Databases
Language : English
[AI] Foundation Principles of Machine Learning (M1)
Coordinators : François Landes, Michele Sebag
[AI] Large-Scale Distributed Data Processing (M1)
Coordinators : Benoît Groz, Silviu Maniu
[DS] Algorithms for Data Science (M2)
Coordinator : Silviu Maniu
The course focuses on the algorithms involved in data-related tasks, collectively grouped under the concept of "data mining". Data mining is a set of algorithms for transforming, modelling, and interpreting data that can be applied directly to Data Science tasks, or that may be needed as a pre-processing step before the data can be passed to, e.g., a machine learning task. Lectures will cover a wide range of concepts, such as: data mining algorithms (finding similar items, e.g., LSH; finding frequent items; dimensionality reduction techniques); applications to advertising and recommendation on the Web, such as collaborative filtering; and data stream mining, i.e., applying data mining algorithms when the data arrives at a fast pace and memory is insufficient to store (or access) the entire dataset.
Prerequisite : Algorithms, Programming (Python/C/Java)
Language : English
[DS] Social and Graph Data Management (M2)
Coordinator : Silviu Maniu
The course teaches students the basics of social and graph data management and is organized in two parts. The first part studies graph metrics (degree distributions, clustering coefficients, distance metrics, etc.) with the objective of applying them to the analysis of real graph data, especially social network graphs as found in Web applications such as Facebook or Twitter, and of establishing what makes them special compared to standard random graphs. The second part of the course focuses on graph algorithms used for graph data analysis (PageRank, probabilistic reachability analysis) and applies them to a variety of applications such as link analysis, influence maximization, or link prediction. This part is more practical and is complemented by hands-on sessions in which the concepts are applied. The course also presents well-known graph database systems, such as Neo4J.
Prerequisite : Advanced databases, Algorithms, Programming (Python/C/Java)
Language : English
[DS] Semantic Web and Ontologies (M2)
Coordinator : Yue Ma
Searching for information across rich web resources has become a necessity for a large number of advanced applications. In practice, however, several obstacles stand in the way of using traditional keyword-based search, because of the semantic mismatch among different resources. The course introduces an approach to this problem, the so-called Semantic Web technology, and then focuses on the knowledge representation aspect of the Semantic Web, from W3C Semantic Web standards such as RDF, SPARQL and OWL to various representation formalisms (Description Logics) and their reasoning mechanisms.
Prerequisite : Programming (Java), Propositional Logic
Language : English
[DS] Knowledge Discovery in Graph Data (M2)
Coordinator : Fatiha Saïs
Today, we are experiencing an unprecedented production of resources published as Linked Open Data (LOD, for short). This is leading to the creation of knowledge graphs (KGs) containing billions of RDF (Resource Description Framework) triples, such as DBpedia, YAGO and Wikidata on the academic side, and the Google Knowledge Graph or eBay Knowledge Graph on the commercial side. They contain knowledge that is typically expressed in RDF, i.e., as statements of the form (subject, predicate, object). Sometimes, the various types and relations are represented in an OWL2 (Web Ontology Language) ontology, which defines their interrelations and axioms such as subsumption, disjunction and functionality of properties. However, the existing KGs are far from being complete and consistent. Hence, various methods need to be developed on top of these existing KGs: on the one hand, methods that aim to expand and enrich KGs; on the other hand, methods that address the problem of validating their content. In this course we will focus on the identity problem, which consists in finding and validating identity links between resources, and on the problem of knowledge discovery (e.g., key axioms, logical rules) from RDF data. The course will also set aside time for feedback from applications that use knowledge graphs and ontologies, in areas such as bioinformatics, agronomy and IoT.
Prerequisite : Semantic Web and Ontologies
Language : English
[DS] Web of Data (M2)
Coordinator : Fatiha Saïs
Today, we are experiencing an unprecedented production of resources published as Linked Open Data (LOD, for short). This is leading to the creation of knowledge graphs (KGs) that form "the Web of Data", containing billions of RDF (Resource Description Framework) triples. It includes cross-domain knowledge graphs such as DBpedia, YAGO and Wikidata, and domain-specific knowledge graphs such as Bio2RDF and eBay Knowledge Graph. They contain knowledge that is typically expressed in RDF, i.e., as statements of the form (Macron, presidentOf, France). Sometimes, the various types and relations are represented in an OWL2 (Web Ontology Language) ontology, which defines their relationships and axioms such as subsumption, disjunction and functionality of properties. In this course you will learn the principles of building a Web of data, along with an overview of the important problems that arise when enriching and validating the content of the available knowledge graphs, which are far from being complete and consistent. We will focus on the identity problem, which consists in finding and validating identity links between resources, and on the ontology matching problem. The course will also set aside time for feedback from applications that use knowledge graphs and ontologies, in areas such as biology, agronomy and digital humanities.
Prerequisite : Semantic Web and Ontologies, Knowledge Discovery in Graph Data
Language : English
[AI] Optimization (M2)
Coordinators : Anne Auger, Dimo Brockhoff
[HCI] Interactive Information Visualization (M2)
Coordinators : Petra Isenberg, Anastasia Bezerianos