Engineer position: benchmarking the I/O behavior of applications and storage systems (performance and energy consumption)

Applications should be submitted online on the Inria website before September 6, 2024.

Context

Launched in 2023 for a duration of 6 years, the NumPEx PEPR aims to contribute to the design and development of numerical methods and software components that will equip future European Exascale and post-Exascale machines. NumPEx also aims to support scientific and industrial applications in fully exploiting the potential of these machines.

Within NumPEx, the Exa-DoST project will address the major data challenges by proposing operational solutions co-designed and validated in French and European applications. This will fill the gap left by previous international projects and ensure that French and European needs are taken into account in the roadmaps for building the data-oriented Exascale software stack.

Mission

The recruited person will be responsible for characterizing the I/O behavior of applications that have been chosen as representative of the French HPC workload. This characterization will be done using profiling tools such as Darshan and TAU, tracing tools such as Recorder, and by inspecting the source code of the applications. We are interested in developing I/O kernels: codes that mimic the I/O activities (accesses to persistent data) of the applications and make it easier to evaluate that I/O behavior on different platforms.

In addition, the person will be responsible for performing experiments on different I/O infrastructures to characterize their behavior and how it is affected by different characteristics of the accesses. Existing benchmarks such as IOR and mdtest will be used at first, but new benchmarks may need to be developed.

The selection of benchmarks and access patterns will involve the study of research papers.

The expected results are a suite of benchmarks that can be easily applied to new platforms, the I/O kernels, a database of the results obtained, and a report.

Main activities

Main activities:

  • Studying papers about the workloads of real large-scale HPC machines and the workloads imposed by known classes of applications (for example, machine learning)
  • Running applications and benchmarks on HPC systems using scripts, then processing and plotting the results
  • Studying large HPC applications (usually written in C/C++ or Fortran) to understand their I/O behavior
  • Developing I/O kernels and benchmarks in C/C++ using MPI-IO
  • Analyzing and modeling the results statistically (Python or R)

Additional activities:

  • Writing reports and research papers (LaTeX)

Required Skills

We are looking for a junior engineer with less than three years of experience and a graduate degree or equivalent.

Technical skills and level required:

  • C/C++
  • scripting (Bash, Python, etc.)
  • Unix: command line, ssh, etc.
  • a plus (not mandatory): using HPC systems, Slurm, etc.
  • experience in research, especially in HPC, would also be a plus

More information

For more information on the proposed research subject:

For other information, please contact Francieli Zanon-Boito ([email protected]).