Argonne Team Breaks Record For Globus Data Movement

By ET Bureau - July 09, 2019

2.9 PB transfer on Oak Ridge Summit supercomputer is largest ever for Globus; involves three of the largest known cosmological simulations

Globus, the leading research data management service, today announced the largest single file transfer in its history: a team led by Argonne National Laboratory scientists moved 2.9 petabytes of data as part of a research project involving three of the largest cosmological simulations to date.

“Storage is in general a very large problem in our community — the Universe is just very big, so our work can often generate a lot of data,” explained Katrin Heitmann, Argonne physicist and computational scientist and an Oak Ridge National Laboratory Leadership Computing Facility (OLCF) Early Science user. “Using Globus to easily move the data around between different storage solutions and institutions for analysis is essential.”

The data in question was stored on the Summit supercomputer at OLCF, currently the world’s fastest supercomputer according to the Top500 list published June 18, 2019. Globus was used to move the files from disk to tape, a key use case for researchers.

“Due to its uniqueness, the data is very precious and the analysis will take time,” said Dr. Heitmann. “The first step after the simulations were finished was to make a backup copy of the data to HPSS, so we can move the data back and forth between disk and tape and thus carry out the analysis in steps. We use Globus for this work due to its speed, reliability, and ease of use.”

“With exascale imminent, AI on the rise, HPC systems proliferating, and research teams more distributed than ever, fast, secure, reliable data movement and management are now more important than ever,” said Ian Foster, Globus co-founder and director of Argonne’s Data Science and Learning Division. “We tend to take these functions for granted, and yet modern collaborative research would not be possible without them.”

“Globus has underpinned groundbreaking research for decades. We could not be prouder of our role in helping scientists do their world-changing work, and we’re happy to see projects like this one continue to push the boundaries of what Globus can achieve. Congratulations to Dr. Heitmann and team!”

When it comes to data transfer performance, “the most important part is reliability,” says Dr. Heitmann. “It is basically impossible for me as a user to check the very large amounts of data upon arrival after a transfer has finished. The analysis of the data often uses a subset of the data, so it would take quite a while until bad data would be discovered and at that point we might not have the data anymore at the source. So the reliability aspects of Globus are key.”

“Of course, speed is also important. If the transfers were very slow, given the amount of data we transfer, we would have had a problem. So it’s good to be able to rely on Globus for fast data movement as well. We are also grateful to Oak Ridge for access to Summit and for their excellent setup of data transfer nodes enabling the use of Globus for HPSS transfers. This work would not have been possible otherwise.”
