Advancing Science at Exascale – Discover the Latest Research from NumPEx

NumPEx is leading exascale computing research, advancing high-performance computing (HPC) to tackle the biggest scientific and industrial challenges. Our team is working on the latest algorithms, architectures and software optimisations for the next generation of computational science.

Here you will find our scientific publications, hosted on HAL, with the latest results and breakthroughs from the NumPEx community. Whether you are a researcher, an industry professional or simply curious about exascale computing, these publications will give you a glimpse of the future of HPC.

We are always looking for new talent: students, researchers and engineers who want to shape the future of computational science. Interested in contributing to this field? Have a look at our work and join us to push the limits of performance and innovation.




69 documents

Journal articles

  • Theo Mary, Mantas Mikaitis. Error Analysis of Matrix Multiplication with Narrow Range Floating-Point Arithmetic. SIAM Journal on Scientific Computing, 2025, 47 (4), pp.B785-B800. ⟨10.1137/24M1685109⟩. ⟨hal-04671474v2⟩
  • Danilo Carastan-Santos, Georges da Costa, Igor Fontana de Nardin, Millian Poquet, Krzysztof Rzadca, et al.. Scheduling with lightweight predictions in power-constrained HPC platforms. IEEE Transactions on Parallel and Distributed Systems, 2025, pp.1-12. ⟨10.1109/TPDS.2025.3586723⟩. ⟨hal-04747713v3⟩
  • Nathalie Furmento, Abdou Guermouche, Gwenolé Lucas, Thomas Morin, Samuel Thibault, et al.. Optimizing Parallel Heterogeneous System Efficiency: Dynamic Task Graph Adaptation with Recursive Tasks. Journal of Parallel and Distributed Computing, 2025, 205, pp.105157. ⟨10.1016/j.jpdc.2025.105157⟩. ⟨hal-05199066⟩
  • Theo Mary. Error analysis of the Gram low-rank approximation (and why it is not as unstable as one may think). SIAM Journal on Matrix Analysis and Applications, 2025, 46 (2), pp.1444-1459. ⟨10.1137/24M1687649⟩. ⟨hal-04554516v2⟩
  • Erin Carson, Xinye Chen, Xiaobo Liu. Mixed Precision HODLR Matrices. SIAM Journal on Scientific Computing, In press, 47 (3), pp.A1408-A1435. ⟨10.1137/24M1683925⟩. ⟨hal-05066310⟩
  • Sally Ellingson, Guillaume Pallez. Result-Scalability: Following the Evolution of Selected Social Impact of HPC. International Journal of High Performance Computing Applications, 2025, 39 (5), pp.713-721. ⟨10.1177/10943420251338168⟩. ⟨hal-05037241⟩
  • Jan Boelts, Michael Deistler, Manuel Gloeckler, Álvaro Tejero-Cantero, Jan-Matthis Lueckmann, et al.. sbi reloaded: a toolkit for simulation-based inference workflows. Journal of Open Source Software, 2025, 10 (108), pp.7754. ⟨10.21105/joss.07754⟩. ⟨hal-05072785⟩
  • Amaury Bélières--Frendo, Emmanuel Franck, Victor Michel-Dansac, Yannick Privat. Volume-Preserving Geometric Shape Optimization of the Dirichlet Energy Using Variational Neural Networks. Neural Networks, 2025, 184, pp.106957. ⟨10.1016/j.neunet.2024.106957⟩. ⟨hal-04663197v3⟩
  • Alfredo Buttari, Theo Mary, André Pacteau. Truncated QR factorization with pivoting in mixed precision. SIAM Journal on Scientific Computing, 2025, 47 (2), pp.B382-B401. ⟨10.1137/24M1644705⟩. ⟨hal-04490215v2⟩
  • Georges da Costa. Hardware and application aware performance, power and energy models for modern HPC servers with DVFS. Sustainable Computing: Informatics and Systems, 2025, 46, pp.101106. ⟨10.1016/j.suscom.2025.101106⟩. ⟨hal-04983485⟩
  • Joubine Aghili, Romain Hild, Victor Michel-Dansac, Vincent Vigon, Emmanuel Franck. Accelerating the convergence of Newton's method for nonlinear elliptic PDEs using Fourier neural operators. Communications in Nonlinear Science and Numerical Simulation, 2025, 140 (2), pp.108434. ⟨10.1016/j.cnsns.2024.108434⟩. ⟨hal-04440076v4⟩
  • S. Wang, S. Mignot, S. Prunet, L. Di Mascolo, M. Spinelli, et al.. A Decentralized Framework for Radio-interferometric Image Reconstruction. The Astronomical Journal, 2025, 169, ⟨10.3847/1538-3881/adc57b⟩. ⟨insu-05099859⟩
  • Ha Pham, Florian Faucher, Damien Fournier, Hélène Barucq, Laurent Gizon. Assembling algorithm for Green's tensors and absorbing boundary conditions for Galbrun's equation in radial symmetry. Journal of Computational Physics, 2024, 519, pp.113444. ⟨10.1016/j.jcp.2024.113444⟩. ⟨hal-04503374⟩
  • Thomas Saigre, Christophe Prud'Homme, Marcela Szopos. Model order reduction and sensitivity analysis for complex heat transfer simulations inside the human eyeball. International Journal for Numerical Methods in Biomedical Engineering, 2024, pp.e3864. ⟨10.1002/cnm.3864⟩. ⟨hal-04361954v2⟩
  • Ha Pham, Florian Faucher, Hélène Barucq. Numerical investigation of stabilization in the Hybridizable Discontinuous Galerkin method for linear anisotropic elastic equation. Computer Methods in Applied Mechanics and Engineering, 2024, 428, pp.117080. ⟨10.1016/j.cma.2024.117080⟩. ⟨hal-04503407⟩
  • Robin Boëzennec, Fanny Dufossé, Guillaume Pallez. Qualitatively Analyzing Optimization Objectives in the Design of HPC Resource Manager. ACM Transactions on Modeling and Performance Evaluation of Computing Systems, 2024, 9 (4), pp.1-28. ⟨10.1145/3701986⟩. ⟨hal-04187517v3⟩

Conference papers

  • Robin Boëzennec, Fanny Dufossé, Guillaume Pallez, Alix Tremodeux. Improving Supercomputer Usage with Aging Awareness. Sustainable Supercomputing (Workshop of SC25), Nov 2025, St. Louis, Missouri, United States. ⟨hal-05109521v2⟩
  • Jules Risse, Amina Guermouche, François Trahay. Fine-grain energy consumption modeling of HPC task-based programs. IEEE International Conference on Cluster Computing (CLUSTER 2025), Sep 2025, Edinburgh, United Kingdom. ⟨hal-05200287v2⟩
  • Ahmed Chabib, Roland Greffe, Christophe Geuzaine, Axel Modave. Portage GPU d'un solveur éléments finis discontinus hybridisé pour les problèmes d'ondes en fréquence. CFM 2025 - 26e Congrès Français de Mécanique, Aug 2025, Metz, France. ⟨hal-05235205⟩
  • Atte Torri, Przemysław Dominikowski, Brice Pointal, Oguz Kaya, Laércio Lima Pilla, et al.. Near-Optimal Contraction Strategies for the Scalar Product in the Tensor-Train Format. Euro-Par 2025 - 31st International European Conference on Parallel and Distributed Computing, Aug 2025, Dresden, Germany. pp.63-77, ⟨10.1007/978-3-031-99872-0_5⟩. ⟨hal-05285400⟩
  • Ana Gainaru, Scott Klasky, Guillaume Pallez. Priority-BF: a Task Manager for Priority-Based Scheduling. EURO-PAR 2025 - 31st International European Conference on Parallel and Distributed Computing, Aug 2025, Dresden, Germany. pp.219-232, ⟨10.1007/978-3-031-99854-6_15⟩. ⟨hal-05109295⟩
  • Tanguy Chatelain. Anticipation des communications réseau grâce à la connaissance du futur dans le parallélisme à tâches. COMPAS 2025 - Conférence francophone d'informatique en Parallélisme, Architecture et Système, Jun 2025, Bordeaux, France. ⟨hal-05147860⟩
  • Francieli Boito, Luan Teylo, Mihail Popov, Théo Jolivel, François Tessier, et al.. A Deep Look Into the Temporal I/O Behavior of HPC Applications. 39th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Jun 2025, Milan, Italy. ⟨10.1109/IPDPS64566.2025.00072⟩. ⟨hal-04887809⟩
  • Catherine Guelque, Valentin Honoré, Philippe Swartvagher, Gaël Thomas, François Trahay. PALLAS: a generic trace format for large HPC trace analysis. 39th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Jun 2025, Milan, Italy. ⟨hal-04970114⟩
  • Méline Trochon, Julien Bigot, Virginie Grandgirard, Dorian Midou. Checkpointing Optimisation to Prepare Future Exascale Plasma Turbulence Simulations. IPDPSW 2025 - 39th IEEE International Parallel & Distributed Processing Symposium, IEEE, Jun 2025, Milan, Italy. ⟨hal-05105811⟩
  • Julien Monniot, François Tessier, Henri Casanova, Gabriel Antoniu. Simulation of Large-Scale HPC Storage Systems: Challenges and Methodologies. HiPC 2024 - 31st IEEE International Conference on High Performance Computing, Data, and Analytics, Dec 2024, Bangalore, India. pp.1-11. ⟨hal-04784808⟩
  • Robin Boëzennec, Danilo Carastan-Santos, Fanny Dufossé, Guillaume Pallez. Allocation Strategies for Disaggregated Memory in HPC Systems. HiPC 2024 - 31st IEEE International Conference on High Performance Computing, Data, and Analytics, Dec 2024, Bangalore, India. pp.1-11. ⟨hal-04815672⟩
  • Sofya Dymchenko, Abhishek Purandare, Bruno Raffin. MelissaDL x Breed: Towards Data-Efficient On-line Supervised Training of Multi-parametric Surrogates with Active Learning. AI4S 2024 - 5th Workshop on artificial intelligence and machine learning for scientific applications, Nov 2024, Atlanta (Georgia), United States. pp.1-9. ⟨hal-04712480⟩
  • Houssem Ouertatani, Cristian Maxim, Smail Niar, El-Ghazali Talbi. Accelerated NAS via pretrained ensembles and multi-fidelity Bayesian Optimization. 33rd International Conference on Artificial Neural Networks (ICANN), Sep 2024, Lugano, Switzerland. ⟨10.1007/978-3-031-72332-2_17⟩. ⟨hal-04611343⟩
  • Marc Baboulin, Simplice Donfack, Oguz Kaya, Theo Mary, Matthieu Robeyns. Mixed precision randomized low-rank approximation with GPU tensor cores. Euro-Par 2024: Parallel Processing, Aug 2024, Madrid, Spain. pp.31-44, ⟨10.1007/978-3-031-69583-4_3⟩. ⟨hal-04520893v4⟩
  • Danilo Carastan-Santos, Georges da Costa, Millian Poquet, Patricia Stolf, Denis Trystram. Light-weight prediction for improving energy consumption in HPC platforms. Euro-Par 2024, Carretero, J., Shende, S., Garcia-Blas, J., Brandic, I., Olcoz, K., Schreiber, M., Aug 2024, Madrid, Spain. pp.152-165, ⟨10.1007/978-3-031-69577-3_11⟩. ⟨hal-04566184v2⟩
  • Alexis Bandet, Francieli Boito, Guillaume Pallez. Scheduling distributed I/O resources in HPC systems. 30th International European Conference on Parallel and Distributed Computing, Aug 2024, Madrid, Spain. ⟨hal-04394004v2⟩
  • Thomas Firmin, Pierre Boulet, El-Ghazali Talbi. Asynchronous Multi-fidelity Hyperparameter Optimization Of Spiking Neural Networks. International Conference on Neuromorphic Systems (ICONS 2024), Jul 2024, Washington, United States. ⟨hal-04781629⟩
  • Thomas Morin. Optimiser l'Efficacité des Systèmes Parallèles : Adaptation Dynamique des Graphes de Tâches Récursives. COMPAS 2024 - Conférence francophone d'informatique en Parallélisme, Architecture et Système, Jul 2024, Nantes, France. ⟨hal-04672417⟩
  • Nathalie Furmento, Abdou Guermouche, Gwenolé Lucas, Thomas Morin, Samuel Thibault, et al.. Optimizing Parallel System Efficiency: Dynamic Task Graph Adaptation with Recursive Tasks. WAMTA 2024 - Workshop on Asynchronous Many-Task Systems and Applications 2024, Feb 2024, Knoxville, United States. ⟨10.2139/ssrn.5224517⟩. ⟨hal-04548787⟩
  • Houssem Ouertatani, Cristian Maxim, Smail Niar, El-Ghazali Talbi. Bayesian optimization for NAS with pretrained deep ensembles. International Conference in Optimization and Learning (OLA), May 2023, Malaga, Spain. ⟨hal-04076075⟩

Poster communications

  • Xinye Chen, Thibault Hilaire, Fabienne Jézéquel. PROMISE: Floating-point Autotuning with Customized Precisions. Workshop on Approximate Computing in Numerical Linear Algebra, Oct 2025, Paris, France. ⟨hal-05291790⟩
  • Przemysław Dominikowski, Atte Torri, Brice Pointal, Oguz Kaya, Laercio Lima Pilla, et al.. Exploring Near-Optimal Contraction Strategies for the Scalar Product in the Tensor-Train Format. IEEE. IPDPS 2025 - 39th IEEE International Parallel & Distributed Processing Symposium, Jun 2025, Milan, Italy. 2025 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp.1274-1276, ⟨10.1109/IPDPSW66978.2025.00210⟩. ⟨hal-05304665⟩
  • Thomas Saigre, Vincent Chabannes, Giovanna Guidoboni, Christophe Prud'Homme, Marcela Szopos, et al.. Effect of Cooling of the Ocular Surface on Endothelial Cell Sedimentation in Cell Injection Therapy: Insights from Computational Fluid Dynamics. ARVO 2025, May 2025, Salt Lake City, United States. ⟨hal-05072761⟩

Proceedings

  • Raphaël Carpintero Perez, Sébastien da Veiga, Josselin Garnier, Brian Staber. Learning signals defined on graphs with optimal transport and Gaussian process regression. International Conference on Artificial Intelligence and Statistics, May 2025, Phuket, Thailand. 2025. ⟨hal-04740924v2⟩
  • Raphaël Carpintero Perez, Sébastien da Veiga, Josselin Garnier, Brian Staber. Gaussian process regression with Sliced Wasserstein Weisfeiler-Lehman graph kernels. International Conference on Artificial Intelligence and Statistics, May 2024, Valencia, Spain. 2024. ⟨hal-04440186v2⟩
  • T. Firmin, E-G. Talbi. A Comparative Study of Fractal-Based Decomposition Optimization. 6th International Conference on Optimization and Learning, 2023, Malaga, Spain. Optimization and Learning, 6th International Conference, OLA 2023, Malaga, Spain, May 3–5, 2023, Proceedings, 1824, Springer Nature Switzerland, pp.3-20, 2023, Communications in Computer and Information Science, 978-3-031-34020-8. ⟨10.1007/978-3-031-34020-8_1⟩. ⟨hal-04138279⟩

Preprints, Working Papers

  • Théo Beuzeville, Alfredo Buttari, Serge Gratton, Theo Mary. Deterministic and probabilistic rounding error analysis of neural networks in floating-point arithmetic. 2025. ⟨hal-04663142v2⟩
  • Emmanuel Franck, Victor Michel-Dansac, Laurent Navoret, Vincent Vigon. Neural semi-Lagrangian method for high-dimensional advection-diffusion problems. 2025. ⟨hal-05051195v3⟩
  • Raphaël Carpintero Perez, Sébastien Da Veiga, Josselin Garnier. A reproducible comparative study of categorical kernels for Gaussian process regression, with new clustering-based nested kernels. 2025. ⟨hal-05289909⟩
  • Jad Yehya, Mansour Benbakoura, Cédric Allain, Benoît Malezieux, Matthieu Kowalski, et al.. RoseCDL: Robust and scalable convolutional dictionary learning for rare-event detection. 2025. ⟨hal-05250429⟩
  • Emmanuel Agullo, Alfredo Buttari, Abdou Guermouche, Antoine Jego. Redundant computations in task-based parallelism with applications to communication-reducing algorithms. 2025. ⟨hal-05176537⟩
  • Yongseok Jang, Pierre Jolivet, Theo Mary. Mixed Precision Augmented GMRES. 2025. ⟨hal-05163845⟩
  • Hassan Ballout, Yvon Maday, Christophe Prud'Homme. Combined Galerkin and Regression Based Algorithm for Parameter Dependant PDE's. 2025. ⟨hal-05143191⟩
  • Alfredo Buttari, Xin Liu, Theo Mary, Bastien Vieublé. Mixed precision strategies for preconditioned GMRES: a comprehensive analysis. 2025. ⟨hal-05071696⟩
  • Patrick Amestoy, Antoine Jego, Jean-Yves L'Excellent, Théo Mary, Grégoire Pichon. BLAS-based Block Memory Accessor with Applications to Mixed Precision Sparse Direct Solvers. 2025. ⟨hal-05019106⟩
  • Marc Baboulin, Oguz Kaya, Theo Mary, Matthieu Robeyns. Numerical stability of tree tensor network operations, and a stable rounding algorithm. 2025. ⟨hal-04996127⟩
  • Thomas Saigre, Vincent Chabannes, Christophe Prud'Homme, Marcela Szopos. Mathematical modeling and simulation of coupled aqueous humor flow and temperature distribution in a realistic 3D human eye geometry. 2025. ⟨hal-04918559⟩
  • Hélène Barucq, Michel Duprez, Florian Faucher, Emmanuel Franck, Frédérique Lecourtier, et al.. Enriching continuous Lagrange finite element approximation spaces using neural networks. 2025. ⟨hal-04935072v2⟩
  • Jérémy Berthomieu, Stef Graillat, Dimitri Lesnoff, Theo Mary. Multiword matrix multiplication over large finite fields in floating-point arithmetic. 2025. ⟨hal-04917201⟩
  • Méline Trochon, Jean-Thomas Acquaviva, Francieli Boito, Brice Goglin, François Tessier, et al.. On the Impact of Interference from Concurrent Jobs on Checkpointing Performance. 2025. ⟨hal-05294610⟩
  • Robin Boëzennec, Fernando Fernandes dos Santos, Brice Goglin, Angeliki Kritikakou, Guillaume Pallez, et al.. Increasing the Lifetime of HPC Machines: Issues, Implications, and Open Challenges. 2025. ⟨hal-05312072⟩
  • El-Mehdi El Arar, Silviu-Ioan Filip, Theo Mary, Elisa Riccietti. Mixed precision accumulation for neural network inference guided by componentwise forward error analysis. 2025. ⟨hal-04995708⟩
  • Alfredo Buttari, Nicholas J Higham, Theo Mary, Bastien Vieublé. A modular framework for the backward error analysis of GMRES. 2024. ⟨hal-04525918v2⟩
  • Albert d'Aviau de Piolant, Hayfa Tayeb, Bérenger Bramas, Mathieu Faverge, Abdou Guermouche, et al.. Improving energy efficiency of HPC applications using unbalanced GPU power capping. 2024. ⟨hal-04883872v2⟩
  • Christophe Prud'Homme, Yvon Maday, Hassan Ballout. Nonlinear compressive reduced basis approximation for multi-parameter elliptic problem. 2024. ⟨hal-04628481⟩
  • Christophe Prud'Homme, Vincent Chabannes, Luca Berti, Maryam Maslek, Philippe Pincon, et al.. Ktirio Urban Building: A Computational Framework for City Energy Simulations Enhanced by CI/CD Innovations on EuroHPC Systems. 2024. ⟨hal-04590586⟩
  • Frédéric Nataf, Emile Parolin. Coarse spaces for non-symmetric two-level preconditioners based on local generalized eigenproblems. 2024. ⟨hal-04536547⟩
  • Alexis Bandet, Francieli Boito, Guillaume Pallez. Prediction and Interpretability of HPC I/O Resources Usage with Machine Learning. 2024. ⟨hal-04698511⟩
  • Thomas Firmin, El-Ghazali Talbi. A fractal-based decomposition framework for continuous optimization. 2022. ⟨hal-04474444v2⟩

Reports

  • Georges da Costa, Amina Guermouche. Measurement methods sheet, WG6 Exa-Soft. Université de Toulouse; Université de Bordeaux. 2025. ⟨hal-05272179⟩
  • Luc Giraud, Carola Kruse, Paul Mycek, Maksym Shpakovych, Yanfei Xiang. Neural network preconditioning: a case study for the solution of the parametric Helmholtz equation. RR-9593, Inria Centre at the University of Bordeaux, France. 2025. ⟨hal-05157038⟩
  • Francieli Boito, Luan Teylo, Mihail Popov, Théo Jolivel, François Tessier, et al.. A deep look into the temporal I/O behavior of HPC applications - extended version. RR-9577, Inria & Labri, Univ. Bordeaux. 2025, pp.1-42. ⟨hal-04978752⟩
  • Alexis Bandet, Francieli Zanon Boito, Guillaume Pallez. Allocation and Placement Algorithms for Scheduling Distributed I/O Resources in HPC Systems. RR-9549, Inria Bordeaux; Inria Rennes. 2024, pp.1-27. ⟨hal-04593977⟩
