Data Axis at LIPN

Description

The LIPN Data Axis aims to bring together work on machine learning and data and to foster collaboration between teams. We organize a monthly seminar on a topic on which at least two LIPN teams are working.

Organizers: Hanène Azzag, Joseph Le Roux
MatterMost: https://mattermost.lipn.univ-paris13.fr/lipn/channels/axe-donnees
BBB: https://bbb.lipn.univ-paris13.fr/b/ler-ppq-nsy-jag
Calendar: calendar.ics

Seminars 2025

10/04/25 Presentation of PyXAI

Gilles Audemard
Thursday 10/04/25, 12:30–13:30, B107
BBB link: https://bbb.lipn.univ-paris13.fr/b/ler-ppq-nsy-jag

Summary

PyXAI (Python eXplainable AI) is a Python library that provides formal explanations suited to tree-based machine learning models (regression or classification) such as Decision Trees, Random Forests and Boosted Trees. PyXAI generates post-hoc, local explanations with formal guarantees. It can also rectify models when their predictions do not comply with the user’s knowledge.

In this talk, we will present the concepts underlying PyXAI and illustrate how it works through demonstrations.

🔗 Link to PyXAI: https://www.cril.univ-artois.fr/software/pyxai/

06/02/25 Explainable AI and Decision Trees

Louenas Bounia, Daniel Kopisitskiy
Thursday 06/02/25, 12:30–13:30, B107
BBB link: https://bbb.lipn.univ-paris13.fr/b/ler-ppq-nsy-jag

Titles and Abstracts

Daniel will present: “Advantages and Challenges of Optimal Decision Trees as exemplified by the Strong-Max-Flow Formulation”.

The abstract is:

Decision trees are a widely used ML classifier. Heuristic construction methods produce accurate classifiers at the expense of large tree depth and poor interpretability. In recent years, optimal decision trees constructed with mathematical programming have gained traction for their increased accuracy at lower depth and the optimality guarantees that come with solving mathematical programs.

In this talk I will present the Strong-Max-Flow Formulation by Aghaei et al., which stands out as an intuitive construction of an optimal binary decision tree for binary data.

Many subsequent publications have attempted to reduce the computational effort required to solve the mathematical program as the problem is NP-hard and scales very poorly with tree depth. I aim to highlight the challenges and advantages of using optimal decision trees.
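To make the scaling issue concrete, here is a minimal sketch that finds an optimal depth-2 tree on binary data by exhaustive enumeration. It is not the Strong-Max-Flow formulation, which is a mathematical program. It is a brute-force stand-in, and all function names and the toy XOR dataset are assumptions made for illustration. A complete tree of depth d has 2^d - 1 branching nodes, so with n features the enumeration visits n^(2^d - 1) candidate trees.

```python
from collections import Counter
from itertools import product


def leaf_errors(labels):
    """Misclassifications when a leaf predicts its majority label."""
    if not labels:
        return 0
    return len(labels) - Counter(labels).most_common(1)[0][1]


def tree_errors(X, y, root, left, right):
    """Training errors of a depth-2 tree.

    `root` is the feature tested first; `left` is tested when the root
    feature is 0 and `right` when it is 1.
    """
    leaves = {}
    for row, label in zip(X, y):
        child = right if row[root] else left
        leaves.setdefault((row[root], row[child]), []).append(label)
    return sum(leaf_errors(labels) for labels in leaves.values())


def optimal_depth2_tree(X, y):
    """Enumerate every (root, left, right) feature assignment and keep the best."""
    n_features = len(X[0])
    candidates = product(range(n_features), repeat=3)
    best = min(candidates, key=lambda t: tree_errors(X, y, *t))
    return best, tree_errors(X, y, *best)


# Toy data (hypothetical): the label is the XOR of features 0 and 1,
# feature 2 is a distractor.
X = [list(bits) for bits in product([0, 1], repeat=3)]
y = [row[0] ^ row[1] for row in X]

best_tree, best_errors = optimal_depth2_tree(X, y)
print("best (root, left, right):", best_tree, "errors:", best_errors)
```

On this XOR data no single split reduces the error, which is the kind of case where a greedy heuristic can pick a poor root, while the exhaustive search reaches zero training errors at depth 2.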

Louenas will talk about “Improving abductive explanations by integrating user preferences: application to decision trees”

Summary:

This presentation explores how to improve the explainability of artificial intelligence models, decision trees in particular, through abductive explanations tailored to user preferences. An abductive explanation aims to clarify why a model makes a particular decision. However, several explanations may exist for the same instance, even for models as simple as decision trees. Providing a single explanation does not always allow the user to fully understand the AI’s decision, because other, more relevant explanations may exist. Conversely, offering thousands or even millions of explanations does not improve understanding either. This is why we focus on a subset of explanations that meet users’ expectations in terms of clarity and relevance and that are considered good enough.

The goal is to make model decisions more transparent by placing the user at the center of the process, offering personalized explanations that take into account their priorities, such as feature importance or weightings.

Experimental results on various datasets demonstrate the performance and scalability of these methods in the context of decision trees. This work underlines the importance of user preferences in making AI systems more reliable, understandable and transparent.
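The idea of choosing among several abductive explanations according to user preferences can be sketched as follows. This is only an illustration under stated assumptions, not the speakers’ method: the toy tree, the feature names, the weights and the greedy deletion strategy are all hypothetical. A set of feature values counts as an explanation when it forces the prediction whatever the other features are; features with the lowest preference weight are dropped first.

```python
# Toy decision tree over binary features (hypothetical example).
# Internal node: (feature, subtree_if_0, subtree_if_1); leaf: class label.
TREE = ("income_high",
        ("has_guarantor", 0, ("stable_job", 0, 1)),
        ("stable_job", ("has_guarantor", 0, 1), 1))


def predict(tree, instance):
    """Follow the branches selected by the instance down to a leaf."""
    while isinstance(tree, tuple):
        feature, if_0, if_1 = tree
        tree = if_1 if instance[feature] else if_0
    return tree


def reachable_labels(tree, fixed):
    """Labels that can be reached when only the features in `fixed` are known."""
    if not isinstance(tree, tuple):
        return {tree}
    feature, if_0, if_1 = tree
    if feature in fixed:
        return reachable_labels(if_1 if fixed[feature] else if_0, fixed)
    return reachable_labels(if_0, fixed) | reachable_labels(if_1, fixed)


def is_sufficient(tree, instance, features, target):
    """True if fixing `features` to the instance values forces `target`."""
    fixed = {f: instance[f] for f in features}
    return reachable_labels(tree, fixed) == {target}


def preferred_explanation(tree, instance, weights):
    """Greedy deletion: try to drop the least preferred features first."""
    target = predict(tree, instance)
    kept = set(instance)
    for feature in sorted(instance, key=lambda f: weights.get(f, 0.0)):
        if is_sufficient(tree, instance, kept - {feature}, target):
            kept.remove(feature)
    return {f: instance[f] for f in kept}, target


instance = {"income_high": 1, "stable_job": 1, "has_guarantor": 1}
prefers_income = {"income_high": 3, "stable_job": 2, "has_guarantor": 1}
prefers_guarantor = {"income_high": 1, "stable_job": 2, "has_guarantor": 3}

print(preferred_explanation(TREE, instance, prefers_income))
print(preferred_explanation(TREE, instance, prefers_guarantor))
```

The same instance receives two different explanations depending on the weights, each of which is subset-minimal because removing any remaining feature would no longer force the prediction.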

Seminars 2024

20/06/24 Dimensionality Reduction and Alpha-Shape Representation for Pattern Recognition

Djamel Bouchaffra
Thursday 16/05/24, 13:00–15:00, B107
BBB link: https://bbb.lipn.univ-paris13.fr/b/ler-ppq-nsy-jag

Summary

We introduce a novel formalism that first performs nonlinear dimensionality reduction and then captures topological features (such as the shape depicted by the observed data) in the latent space to conduct pattern classification. This is achieved by: (i) reducing the dimension of the observed variables through a kernelized radial basis functions (KRBF) technique and expressing the probability distribution of the latent variables in terms of the observed variables; (ii) disclosing the data manifold as a 3D polyhedron via the α-shape constructor and extracting topological invariants; and (iii) classifying a data set using a mixture of multinomial distributions. We are exploring this methodology to address the problem of age-invariant face recognition. Experimental results have demonstrated the efficiency of the proposed methodology compared to some state-of-the-art approaches.
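As a small illustration of what a topological invariant of a polyhedron can look like, the sketch below computes the Euler characteristic V - E + F of a triangulated surface. The summary does not say which invariants the method extracts, so this choice is an assumption, and the function name and toy meshes are hypothetical.

```python
def euler_characteristic(triangles):
    """Return V - E + F for a surface given as triangles of vertex ids."""
    faces = {frozenset(tri) for tri in triangles}
    vertices = {v for tri in faces for v in tri}
    edges = set()
    for a, b, c in triangles:
        edges.update({frozenset((a, b)), frozenset((b, c)), frozenset((a, c))})
    return len(vertices) - len(edges) + len(faces)


# Boundary of a tetrahedron: 4 vertices, 6 edges, 4 faces.
TETRAHEDRON = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

# Boundary of an octahedron: 6 vertices, 12 edges, 8 faces.
OCTAHEDRON = [(0, 2, 4), (2, 1, 4), (1, 3, 4), (3, 0, 4),
              (2, 0, 5), (1, 2, 5), (3, 1, 5), (0, 3, 5)]

print(euler_characteristic(TETRAHEDRON), euler_characteristic(OCTAHEDRON))
```

Both surfaces are sphere-like, so they share the same value even though their vertex, edge and face counts differ; that is what makes the quantity a shape descriptor rather than a size descriptor.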

16/05/24 GLiNER: Generalist Model for Named Entity Recognition using Bidirectional Transformer

Urchade Zaratiana
Thursday 16/05/24, 12:00–13:00, B107 (note the unusual time)
BBB link: https://bbb.lipn.univ-paris13.fr/b/ler-ppq-nsy-jag

Summary

Named Entity Recognition (NER) is essential in various Natural Language Processing (NLP) applications. Traditional NER models are effective but limited to a set of predefined entity types. In contrast, Large Language Models (LLMs) can extract arbitrary entities through natural language instructions, offering greater flexibility. However, their size and cost, particularly for those accessed via APIs like ChatGPT, make them impractical in resource-limited scenarios. In this paper, we introduce a compact NER model trained to identify any type of entity. Leveraging a bidirectional transformer encoder, our model, GLiNER, facilitates parallel entity extraction, an advantage over the slow sequential token generation of LLMs. Through comprehensive testing, GLiNER demonstrates strong performance, outperforming both ChatGPT and fine-tuned LLMs in zero-shot evaluations on various NER benchmarks.
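The contrast with sequential generation can be sketched with a toy span scorer: every candidate span is compared with every entity-type vector independently, so all scores could be computed in parallel. This is a simplified illustration, not the GLiNER implementation. The hand-written 2-dimensional vectors, the mean-pooled span representation and all function names are assumptions; in the real model these representations are learned by the encoder.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def enumerate_spans(n_tokens, max_width):
    """All (start, end) token spans, end inclusive, up to `max_width` tokens."""
    return [(i, j)
            for i in range(n_tokens)
            for j in range(i, min(i + max_width, n_tokens))]


def span_vector(token_vecs, start, end):
    """Mean of the token vectors in the span (an assumed pooling choice)."""
    width = end - start + 1
    dim = len(token_vecs[0])
    return [sum(v[d] for v in token_vecs[start:end + 1]) / width
            for d in range(dim)]


def extract_entities(tokens, token_vecs, type_vecs, max_width=2, threshold=0.5):
    """Score each (span, entity type) pair independently and keep the matches."""
    results = []
    for start, end in enumerate_spans(len(tokens), max_width):
        span = span_vector(token_vecs, start, end)
        for label, type_vec in type_vecs.items():
            score = sigmoid(sum(a * b for a, b in zip(span, type_vec)))
            if score > threshold:
                results.append((" ".join(tokens[start:end + 1]), label, score))
    return results


# Hypothetical embeddings: dimension 0 ~ "person-like", dimension 1 ~ "place-like".
tokens = ["Marie", "Curie", "visited", "Paris"]
token_vecs = [[4.0, -4.0], [4.0, -4.0], [-6.0, -6.0], [-4.0, 4.0]]
type_vecs = {"person": [1.0, 0.0], "location": [0.0, 1.0]}

for text, label, score in extract_entities(tokens, token_vecs, type_vecs):
    print(f"{text!r} -> {label} ({score:.2f})")
```

The sketch returns overlapping candidates such as a single token and the two-token span that contains it; a full system would add a decoding step to resolve such overlaps. New entity types are handled by adding an entry to `type_vecs`, with no change to the scoring code.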

04/04/24 Graph Neural Networks

Hugo Attali, Francesco Demelas, Alexandre Schulz
Thursday 04/04/24, 13:00–15:00, B107
BBB link: https://bbb.lipn.univ-paris13.fr/b/ler-ppq-nsy-jag

Summary

The success of deep learning in the Euclidean domain has prompted great interest in generalizing neural networks to non-Euclidean domains such as graphs. Given the broad spectrum of problems that can be effectively modeled using graphs, learning representations of nodes, edges, or graphs requires neural networks tailored to graph structures. Nonetheless, despite their wide use, several challenges persist, notably over-smoothing and over-squashing. The objective of this seminar is to examine various methodologies for learning representations on graphs while also addressing the limitations of these approaches.

We will present two applications of such methods to combinatorial optimization problems: learning to sparsify MIP formulations by eliminating variables (applied to Multi-Commodity Flow problems) and approximating Lagrangian multipliers for compact formulations of MILPs, with concrete applications to Multi-Commodity Network Design, Generalized Assignment and Capacitated Warehouse Location problems.
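The over-smoothing issue mentioned in the summary can be illustrated with a minimal sketch that keeps only the neighborhood-averaging step of a message-passing layer. Real GNN layers also apply learned weights and nonlinearities; the graph, the scalar node features and the function names below are assumptions chosen for illustration.

```python
def mean_aggregate(adjacency, features):
    """One round of message passing: each node averages itself and its neighbors."""
    updated = []
    for node, neighbors in enumerate(adjacency):
        group = [node] + neighbors
        updated.append(sum(features[j] for j in group) / len(group))
    return updated


def spread(features):
    """Gap between the largest and smallest node feature."""
    return max(features) - min(features)


def spreads_over_layers(adjacency, features, n_layers):
    """Record the feature spread before and after each aggregation round."""
    history = [spread(features)]
    for _ in range(n_layers):
        features = mean_aggregate(adjacency, features)
        history.append(spread(features))
    return history


# Path graph 0 - 1 - 2 - 3 with two clearly distinct groups of nodes.
PATH_GRAPH = [[1], [0, 2], [1, 3], [2]]
history = spreads_over_layers(PATH_GRAPH, [0.0, 0.0, 1.0, 1.0], 30)

for depth in (0, 5, 10, 30):
    print(f"after {depth:2d} rounds: spread = {history[depth]:.4f}")
```

As rounds are stacked, the node features drift toward a common value, so the two groups that were clearly separated at the start become hard to tell apart. This is the behavior that deep stacks of aggregation layers have to counteract.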

07/03/24 Unveiling the Power of Attention: A Deep Dive into Attention Mechanisms

Bilal Faye and Nicolas Floquet
Thursday 07/03/24, 13:00–15:00, B107
video recording
presentation

Summary

Discover the significant influence of attention mechanisms by delving into their fundamental principles and exploring their transformative role. From multimodal fusion, where attention enhances the integration of information across different data types, to applications in graphs, where attention optimizes the analysis of complex, interconnected structures, this presentation provides an overview of the multifaceted nature of this technology.
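As a reminder of the basic principle, here is a minimal pure-Python sketch of scaled dot-product attention, the building block used by most attention-based models. The toy query, key and value vectors and the function names are assumptions for illustration; practical implementations work on batched tensors and add learned projections.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention.

    Each query is compared with every key; the resulting weights mix the values.
    Returns the outputs and the attention weights.
    """
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        mixed = [sum(w * v[d] for w, v in zip(weights, values))
                 for d in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


# One query that points in the same direction as the first key.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]

outputs, weights = attention(queries, keys, values)
print("weights:", weights[0])
print("output:", outputs[0])
```

The query puts more weight on the key it is aligned with, so the output is pulled toward the corresponding value. In multimodal fusion or graph attention, the same mechanism is applied with keys and values coming from another modality or from neighboring nodes.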