Time-Sharing Redux for Large-Scale HPC Systems

Abstract

© 2016 IEEE. HPC facilities typically use batch scheduling to space-share jobs. In this paper we revisit time-sharing using a trace of over 2.4 million jobs obtained during 20 months of operation of a modern petascale supercomputer. Our simulations show that batch scheduling produces skewed distributions with much larger slowdowns for shorter-running, larger jobs, whereas time-sharing produces more uniform slowdowns. Consequently, for applications that strong scale, the turnaround time does not scale with batch scheduling, but it does with time-sharing, resulting in turnarounds that are orders of magnitude better at the largest scales. We also show that time-sharing can confer additional benefits in noisy systems and with modern programming practices. Future Exascale HPC systems are expected to exhibit billion-way heterogeneous parallelism and poor performance predictability. Because many applications will run in strong-scaling mode, how resource allocation policies affect the experience of supercomputer users has once again become a timely subject.
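
The contrast between skewed and uniform slowdowns can be illustrated with a toy model. The sketch below is not the simulator used in the paper; it assumes every job arrives at time 0, every job needs the whole machine, and context switches are free. Slowdown is taken here as turnaround time divided by runtime, and the function names `fcfs_slowdowns` and `time_sharing_slowdowns` are hypothetical.

```python
def fcfs_slowdowns(runtimes):
    """Run-to-completion in arrival order (a stand-in for batch scheduling)."""
    clock = 0.0
    slowdowns = []
    for runtime in runtimes:
        clock += runtime                  # job finishes after everything ahead of it
        slowdowns.append(clock / runtime)
    return slowdowns


def time_sharing_slowdowns(runtimes):
    """Equal time slices among all unfinished jobs (idealized time-sharing)."""
    n = len(runtimes)
    # Jobs finish in order of increasing runtime when all start together.
    order = sorted(range(n), key=lambda i: runtimes[i])
    slowdowns = [0.0] * n
    clock = 0.0
    served = 0.0                          # service each unfinished job has received
    for rank, i in enumerate(order):
        active = n - rank                 # jobs still sharing the machine
        clock += active * (runtimes[i] - served)
        served = runtimes[i]
        slowdowns[i] = clock / runtimes[i]
    return slowdowns


if __name__ == "__main__":
    jobs = [100.0, 1.0, 1.0]              # one long job queued ahead of two short ones
    print("batch:       ", fcfs_slowdowns(jobs))
    print("time-sharing:", time_sharing_slowdowns(jobs))
```

In this toy case the short jobs wait behind the long one under run-to-completion and see slowdowns around 100, whereas under equal time slices every job's slowdown stays at or below the number of jobs sharing the machine.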

Author's Profile

Estrella Roman
California State University, Los Angeles
