Open Access BASE2020

Seamlessly Managing HPC Workloads Through Kubernetes

Abstract

[EN] This paper describes an approach to integrate the jobs management of High Performance Computing (HPC) infrastructures in cloud architectures by managing HPC workloads seamlessly from the cloud job scheduler. The paper presents hpc-connector, an open source tool that is designed for managing the full life cycle of jobs in the HPC infrastructure from the cloud job scheduler interacting with the workload manager of the HPC system. The key point is that, thanks to running hpc-connector in the cloud infrastructure, it is possible to reflect in the cloud infrastructure, the execution of a job running in the HPC infrastructure managed by hpc-connector. If the user cancels the cloud-job, as hpc-connector catches Operating System (OS) signals (for example, SIGINT), it will cancel the job in the HPC infrastructure too. Furthermore, it can retrieve logs if requested. Therefore, by using hpc-connector, the cloud job scheduler can manage the jobs in the HPC infrastructure without requiring any special privilege, as it does not need changes on the Job scheduler. Finally, we perform an experiment training a neural network for automated segmentation of Neuroblastoma tumours in the Prometheus supercomputer using hpc-connector as a batch job from a Kubernetes infrastructure. ; The work presented in this article has been partially funded by the regional government of the Comunitat Valenciana (Spain), co-funded by the European Union ERDF funds (European Regional Development Fund) of the Comunitat Valenciana 2014¿2020, with reference IDIFEDER/2018/032 (High-Performance Algorithms for the Modeling, Simulation and early Detection of diseases in Personalized Medicine). The work is also co-funded by PRIMAGE (PRedictive In-silico Multiscale Analytics to support cancer personalised diaGnosis and prognosis, empowered by imaging biomarkers) a Horizon 2020 RIA project funded under the topic SC1-DTH-07-2018 by the European Commission, with grant agreement no: 826494. ; López-Huguet, S.; Segrelles Quilis, JD.; Kasztelnik, M.; Bubak, M.; ...

Problem melden

Wenn Sie Probleme mit dem Zugriff auf einen gefundenen Titel haben, können Sie sich über dieses Formular gern an uns wenden. Schreiben Sie uns hierüber auch gern, wenn Ihnen Fehler in der Titelanzeige aufgefallen sind.