The third co-design/co-development workshop of the Exa-DI project (Development and Integration) of the PEPR NumPEx was dedicated to “Artificial Intelligence for HPC@Exascale”, targeting the two topics “Image analysis @ exascale” and “Data analysis and robust inference @ exascale”. It took place on October 2 and 3, 2024 at the Espace La Bruyère, Du Côté de la Trinité (DCT) in Paris.

 

This face-to-face workshop brought together, for two days, Exa-DI members, members of the other NumPEx projects (Exa-MA: Methods and Algorithms for Exascale, Exa-SofT: HPC Software and Tools, Exa-DoST: Data-oriented Software and Tools for the Exascale, and Exa-AToW: Architectures and Tools for Large-Scale Workflows), application demonstrators (ADs) from various research and industry sectors, and experts. They discussed advances and future directions for integrating Artificial Intelligence into HPC/HPDA workflows at exascale, focusing on the two topics “Large image analysis” and “Data analysis and robust inference”.

 

This workshop is the third in a series of co-design/co-development workshops whose main objective is to promote software stack co-development strategies to accelerate exascale development and performance portability of computational science and engineering applications. It differs somewhat from the previous two in that it is prospective in character, targeting the increasing importance of rapidly evolving AI-driven and AI-coupled HPC/HPDA workflows in “Large image analysis @ exascale” and “Data analysis (simulation, experiments, observation) & robust inference @ exascale”. Its main objectives are, first, to co-develop a shared understanding of the different modes of coupling AI into HPC/HPDA workflows; second, to co-identify the execution motifs most commonly found in scientific applications, in order to drive the co-development of collaborative, specific benchmarks or proxy apps that make it possible to measure the end-to-end performance of AI-coupled HPC/HPDA workflows; and finally, to co-identify the software components (libraries, frameworks, data communication, workflow tools, abstraction layers, programming and execution environments) to be co-developed and integrated in order to improve and accelerate critical components.

Key sessions included

  • Introduction and Context: Setting the stage for the workshop’s two main topics as well as presenting the GT IA, a transverse action in NumPEx.
  • Attendees Self-Introduction: Allowing attendees to introduce themselves and their interests.
  • Various Sessions: These sessions featured talks on the challenges to tackle and bottlenecks to overcome (execution speed, scalability, volume of data…), on the type, format and volume of data currently investigated, on the frameworks or programming languages currently used (e.g. Python, PyTorch, JAX, C++, etc.) and on the typical elementary operations performed on data.
  • Discussions and Roundtables: These sessions provided opportunities for attendees to engage in discussions and share insights on the presented topics in order to determine a strategy to tackle the challenges in the co-design and co-development process.

Invited speakers

  • Jean-Pierre Vilotte from CNRS, member of Exa-DI, who provided the introductory context for the workshop.
  • Thomas Moreau from Inria, member of Exa-DoST, presenting the GT IA, a transverse action in NumPEx.
  • Tobias Liaudat from CEA, discussing fast and scalable uncertainty quantification for scientific imaging.
  • Damien Gradatour from CNRS, addressing how to build new brains for giant astronomical telescopes with deep neural networks.
  • Antoine Petiteau from CEA, discussing data analysis for observing the Universe with gravitational waves at low frequency.
  • Kevin Sanchis from Safran AI, addressing benchmarking self-supervised learning methods in remote sensing.
  • Hugo Frezat from Université Paris Cité, presenting learning subgrid-scale models for turbulent rotating convection.
  • Benoit Semelin from Sorbonne Université, discussing simulation-based inference with cosmological radiative hydrodynamics simulations for SKA.
  • Bruno Raffin & Thomas Moreau from Inria, presenting Machine Learning based analysis of large simulation outputs in Exa-DoST.
  • Julián Tachella from CNRS, presenting DeepInverse: a PyTorch library for solving inverse problems with deep learning.
  • Erwan Allys from ENS-PSL, exploring generative models and component separation in the limited-data regime with the scattering transform.
  • François Lanusse from CNRS, discussing multimodal pre-training for scientific data: towards large data models for astrophysics (online).
  • Christophe Kervazo from Telecom Paris, addressing interpretable and scalable deep learning methods for imaging inverse problems.
  • Eric Anterrieu from CNRS, exploring a deep-learning-based approach to imaging radiometry by aperture synthesis and its implementation.
  • Philippe Ciuciu from CEA, addressing Computational MRI in the deep learning era.
  • Pascal Tremblin from CEA, characterizing patterns in HPC simulations using AI-driven image recognition and categorization.
  • Bruno Raffin from Inria, member of Exa-DI, presenting software packaging in Exa-DI.

Outcomes and impacts

Many fruitful discussions took place during this prospective workshop. They first allowed us to progress in understanding the challenges and bottlenecks underpinning the AI-driven HPC/HPDA workflows most commonly found in the ADs. A first series of associated issues was then identified, which can be grouped along two main axes: (i) processing of large volumes of images, resulting either from simulations or from experiments, and (ii) exploration of high-dimensional and multimodal parameter spaces.

One particularly interesting issue that emerged from these discussions concerns the NumPEx software stack: how could it be extended beyond support for classic AI/ML libraries (e.g. TensorFlow, PyTorch) to support concurrent, real-time coupled execution of AI and HPC/HPDA workflows, in ways that allow the AI systems to steer or inform the HPC/HPDA tasks and vice versa?

A first challenge is the coexistence of, and communication between, HPC/HPDA and AI tasks in the same workflow. This communication is mainly hindered by the difference between the programming models used in HPC (typically C, C++ and Fortran) and in AI (typically Python). It calls for more unified data plane management, in which high-level data abstractions are exposed and the complexities of format conversion, data storage and data transport are hidden from both HPC simulations and AI models. A second challenge concerns using the insight provided by AI models and simulations to identify execution motifs commonly found in the ADs and to guide, steer, or modify the shape of the workflow by triggering or stopping HPC/HPDA tasks. This implies that workflow management systems must be able to ingest, and react dynamically to, inputs coming from the AI models. This should drive the co-development of new libraries, frameworks or workflow tools supporting AI integration into HPC/HPDA workflows.
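As an illustration of the second challenge only, the following minimal Python sketch shows a workflow loop whose task queue is reshaped at run time by an advisor. It is a toy under stated assumptions, not a NumPEx component or any tool presented at the workshop: `simulate` stands in for an HPC/HPDA task, `advise` stands in for an AI model, and every name in it (`Decision`, `run_workflow`, and so on) is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple

# All names in this sketch are hypothetical and chosen for illustration only.
History = List[Tuple[float, float]]  # (parameter, result) pairs


@dataclass
class Decision:
    """What the advisor asks the workflow to do next."""
    stop: bool = False
    new_params: List[float] = field(default_factory=list)


def simulate(param: float) -> float:
    """Stand-in for an expensive HPC/HPDA task: a toy misfit to minimise."""
    return (param - 0.3) ** 2


def advise(history: History, tol: float = 1e-3) -> Decision:
    """Stand-in for an AI model that inspects results and steers the workflow."""
    ranked = sorted(history, key=lambda pr: pr[1])
    if ranked[0][1] < tol:
        return Decision(stop=True)      # good enough: cancel remaining tasks
    if len(ranked) < 2:
        return Decision()               # not enough information yet
    # Propose a new task halfway between the two best parameters so far.
    midpoint = 0.5 * (ranked[0][0] + ranked[1][0])
    tried = {p for p, _ in history}
    return Decision(new_params=[] if midpoint in tried else [midpoint])


def run_workflow(
    initial_params: Iterable[float],
    simulate_fn: Callable[[float], float],
    advise_fn: Callable[[History], Decision],
    max_tasks: int = 20,
) -> History:
    """Run tasks one by one, letting the advisor add or cancel pending tasks."""
    pending = list(initial_params)
    history: History = []
    while pending and len(history) < max_tasks:
        param = pending.pop(0)
        history.append((param, simulate_fn(param)))
        decision = advise_fn(history)   # the workflow ingests the advisor's output
        if decision.stop:
            pending.clear()             # stop: drop every queued task
        pending.extend(decision.new_params)  # trigger: enqueue new tasks
    return history


if __name__ == "__main__":
    for param, result in run_workflow([0.0, 1.0], simulate, advise):
        print(param, result)
```

With these toy functions the loop runs six tasks before the advisor stops it. A real workflow manager would run tasks concurrently and exchange data through a shared data plane rather than in-process Python objects; the sketch only shows the control pattern of ingesting an advisor's output and reacting to it.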

In addition, these discussions highlighted an important upcoming action: building cross-functional collaboration between the development of software and workflow components and their integration with the overall NumPEx technologies, and streamlining developer and user workflows.

 

It was therefore decided during this workshop to set up a working group to address these issues and, ultimately, to build a suite of shared, well-specified proxy apps and benchmarks with well-identified data and comparison metrics. Several teams of ADs and experts have expressed their interest in participating in this working group. A first meeting with all interested participants will be organized shortly.

Attendees

  • Jean-Pierre Vilotte, CNRS and member of Exa-DI
  • Valérie Brenner, CEA and member of Exa-DI
  • Jérôme Bobin, CEA and member of Exa-DI
  • Jérôme Charousset, CEA and member of Exa-DI
  • Mark Asch, Université Picardie and member of Exa-DI
  • Bruno Raffin, Inria and member of Exa-DI and Exa-DoST
  • Rémi Baron, CEA and member of Exa-DI
  • Karim Hasnaoui, CNRS and member of Exa-DI
  • Felix Kpadonou, CEA and member of Exa-DI
  • Thomas Moreau, Inria and member of Exa-DoST
  • Erwan Allys, ENS-PSL and application demonstrator
  • Damien Gradatour, CNRS and application demonstrator
  • Antoine Petiteau, CEA and application demonstrator
  • Hugo Frezat, Université Paris Cité and application demonstrator
  • Alexandre Fournier, Institut de physique du globe and application demonstrator
  • Tobias Liaudat, CEA
  • Jonathan Kem, CEA
  • Kevin Sanchis, Safran AI
  • Benoit Semelin, Sorbonne Université
  • Julián Tachella, CNRS
  • François Lanusse, CNRS
  • Christophe Kervazo, Telecom Paris
  • Eric Anterrieu, CNRS
  • Philippe Ciuciu, CEA
  • Pascal Tremblin, CEA

© Valérie Brenner

