Performance Boost for Speech and Text Data Access
Johns Hopkins Deploys Avere for 2X More Grid Jobs, 160 Hours Annual IT Savings
At the Johns Hopkins University Human Language Technology Center of Excellence (HLTCOE), extracting useful information from massive amounts of speech and text data requires equally massive high-performance computing (HPC) and storage infrastructure. The difficulty for the Center’s IT department is to provide these resources within the constraints and uncertainties of a grant-based budget that can vary significantly from year to year.
HLTCOE IT Manager Scott Roberts says Avere Systems storage helps the Center meet both performance and cost objectives. “Deploying an Avere cluster has allowed us to double the number of jobs we can process, from 1.2 million per month to more than 2.4 million. Obtaining equivalent performance via traditional storage would have exceeded our entire annual IT budget by more than 5 times. Avere met our performance requirements in a smaller purchase footprint and assured incremental scalability that protects us against future out-of-band capital expenditures.”
Challenge: Eliminate I/O Bottlenecks, Spend Less
Researchers at the Center of Excellence develop methods for automatically producing useful knowledge from published corpora, including large stores of unstructured text, speech, and document image data in a wide variety of languages and genres. Although humans could potentially extract the same kind of information—for example, scanning hundreds of millions of tweets to glean data about major disease outbreaks or gauge the spread of the flu—the sheer volume of data makes such tasks impractical.
Those large volumes of data also place heavy demand on IT infrastructure. Roberts says that an I/O-bound processing environment began to impact researcher productivity. “Bandwidth was saturated, and we were experiencing a rapidly lengthening queue for HPC grid jobs. The grid workload also impacted responsiveness to user desktops, slowing access to home directories. Our challenge was to find a solution that dramatically improved performance, but that did not require a large upfront investment, a lot of on-going care and feeding, or expensive upgrades at future performance thresholds.”
Solution: Avere for Performance and Value
The Center’s IT team benchmarked potential solutions, including numerous high-end storage arrays and distributed file systems. “We ultimately deployed ZFS, for its scalability, with a commodity core filer front-ended by an Avere cluster,” says Roberts. “Avere delivers the performance we need while enabling the cost advantages of a backend filer built on more economical, high-density SATA drives.”
Today an Avere FXT 3200 Edge filer cluster front-ends approximately 750TB of raw SATA capacity. The cluster provides high-speed I/O to a 150-node, 2500-core HPC cluster utilizing Sun Grid Engine and Hadoop, as well as 50+ user desktops. The Avere solution also ensures high-speed, 24x7x365 file access to a global user community of full-time, adjunct, and student researchers.
Benefits: Performance, Seamless File Access, Savings
140X More IOPS, 2X More Grid Jobs
To evaluate the Avere cluster performance, Roberts ran a combination of grid jobs and synthetic tests, including the IOZone and Bonnie++ file system benchmarking utilities. “Avere delivered near-linear scalability, boosting performance by nearly 90% with the addition of each node. With the Avere solution, performance surged from 2000 IOPS and 1.2GB/second throughput to 280,000 IOPS and 5.75GB/second throughput. What is most impressive is that we saw that gain in both synthetic benchmarks and grid jobs running under real-world conditions. In the past we have seen cases where achieving vendor-published benchmark results involved some behind-the-scenes smoke and mirrors. Avere delivered exactly the performance promised—with no artifice.”
The Center now processes some 2.4 million jobs per month, representing approximately 55 jobs per minute. “Once the new Avere solution was operational, our researchers changed the types of jobs they ran on our HPC cluster. They had entire workflows that were waiting for the increased performance available from the Avere cluster. As a result, we are now running twice as many grid jobs and have reduced queue wait times from 45 to 21 minutes, even though the new grid jobs demand low latency and high throughput,” says Roberts. “Performance to desktops has also improved dramatically. Editing files on the storage array is now on par with editing local files on a laptop. In a cutting-edge research facility, file system responsiveness and faster job throughput translate directly into researcher productivity. Avere performance enables users to conduct more and higher-quality research, explore more paths, and tackle more complex challenges.”
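For readers who want to check the headline figures in this section, the arithmetic can be sketched in a few lines of Python. This is only an illustrative calculation from the numbers quoted in the text. The 30-day month is an assumption (the source does not define one), and the function and constant names are hypothetical.

```python
# Figures quoted in the case study text.
BASELINE_IOPS = 2_000
AVERE_IOPS = 280_000
BASELINE_GB_PER_SEC = 1.2
AVERE_GB_PER_SEC = 5.75
JOBS_PER_MONTH = 2_400_000

# Assumption: a 30-day month; the source does not say how a month is counted.
DAYS_PER_MONTH = 30


def gain(before, after):
    """Return the multiplicative improvement from 'before' to 'after'."""
    return after / before


def jobs_per_minute(jobs_per_month, days=DAYS_PER_MONTH):
    """Convert a monthly job count into an average per-minute rate."""
    minutes_per_month = days * 24 * 60
    return jobs_per_month / minutes_per_month


if __name__ == "__main__":
    print(f"IOPS gain:        {gain(BASELINE_IOPS, AVERE_IOPS):.0f}X")
    print(f"Throughput gain:  {gain(BASELINE_GB_PER_SEC, AVERE_GB_PER_SEC):.2f}X")
    print(f"Jobs per minute:  {jobs_per_minute(JOBS_PER_MONTH):.1f}")
```

Under these assumptions the IOPS ratio works out to 140X, which matches the subhead, and the per-minute rate is a monthly average rather than a peak figure.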
Seamless Access to 1 Billion Files
The Avere solution also allows the Center to consolidate user access to more than one billion files, ranging up to 1GB in size, through a single mount point. Roberts continues, “The combination of a global namespace and Avere FlashMove™ software gives us tremendous flexibility to move files around, add capacity, or even take advantage of object-based services like Amazon S3 without impacting user file access or performance. Changes to the underlying storage infrastructure are transparent to researchers, and administration is simplified.”
160 IT Hours Recovered, More Performance at Lower Cost
Roberts estimates that the Avere solution enables recovery of some 160 hours of IT administrative time annually. “We are a very, very small IT shop. One manager, one senior systems administrator, and one part-time student together support all of our researchers, hundreds of compute servers, the network, and more than half a petabyte of storage capacity. We do everything from purchasing desktops to data center design and management. Part of the value of the Avere solution is its administrative simplicity, from initial deployment to on-going maintenance. We actually deployed the Avere cluster ourselves and within two hours were running grid jobs against it. And the on-going savings from not having to move data or links or directories around has been huge, freeing IT staff to work on more strategic projects.”
“Avere delivers the performance that is critical for our research workloads and the economy our budget necessitates. We have successfully broken the cycle of spending larger and larger amounts to get faster and faster arrays. If we need capacity, we add low-cost, high-density drives, and if we need performance, we scale the Avere solution in affordable increments. This solution also helps us control data center space and power costs, delivering significant performance in a very small footprint. Considering the performance and total cost of ownership across the solutions we evaluated, Avere hands-down delivers the highest performance for our researchers at the very best value for the Center.”
About Johns Hopkins
The Johns Hopkins University, founded in 1876, is a world leader in education, research, and patient care. The University enrolls nearly 20,000 full-time and part-time students on U.S. campuses in Maryland and the District of Columbia, and at international facilities in China and Italy. The Human Language Technology Center of Excellence was founded in January of 2007 to focus on research in all aspects of speech and language technologies.