Thursday, May 29, 2014

Big Data vs. HPC

I wrote a blog post a while back on "HDFS vs. Lustre".

The primary point of that post was that it was not reasonable to compare HDFS to Lustre.  Although I have never worked with other networked file systems like GPFS, Panasas, and pNFS, I believe the same argument can be applied to them as well.  HDFS and those networked file systems serve such different purposes, and have such different architectures, that an apples-to-apples comparison is difficult, if not impossible.

So I saw this article recently on Datanami, "Making Hadoop Relevant to HPC".

I felt the need to respond to several of the points made in it.

Lockwood argues that Hadoop "reinvents a lot of functionality that has existed in HPC for decades, and it does so very poorly."

I can agree that Hadoop reinvents some functionality.  Most notably, job scheduling and resource management are things HPC has done for a long time.  However, to my knowledge, HPC has never had a scheduler/resource manager that tightly integrated the file system with the job/task scheduling itself, hence the need for the Hadoop community to build their own resource manager.  If you want to criticize the Hadoop community for not taking a currently available open source resource manager and writing a plugin for it instead, OK, that's a reasonably fair criticism.
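That "tight integration" point deserves an illustration.  When MapReduce schedules a task, it asks HDFS where the replicas of the task's input block live and prefers to run the task on one of those nodes.  Here is a minimal sketch of that idea; the block names, node names, and data structures are all invented for illustration (real HDFS exposes block locations through its NameNode interface):

```python
# Sketch of locality-aware task scheduling, in the spirit of
# MapReduce-over-HDFS.  All names below are hypothetical.

# Hypothetical map of HDFS block -> nodes holding a replica.
block_replicas = {
    "block_0": ["node1", "node3"],
    "block_1": ["node2", "node3"],
    "block_2": ["node1"],
}

# Nodes that currently have a free task slot.
free_nodes = ["node3", "node2"]

def schedule(block):
    """Prefer a free node that already holds a replica of the block
    (a data-local task); otherwise fall back to any free node, which
    forces a remote read over the network."""
    for node in free_nodes:
        if node in block_replicas[block]:
            return node, "data-local"
    return free_nodes[0], "remote"

for blk in block_replicas:
    node, locality = schedule(blk)
    print(f"{blk} -> {node} ({locality})")
```

A traditional HPC scheduler places jobs on nodes without consulting the file system at all, because with Lustre or GPFS every node sees the data equally over the network; that is exactly the design difference at issue here.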
For example, he said a single Hadoop cluster could support only three concurrent jobs simultaneously. Beyond that, performance suffers.
I'm not really sure where the "three concurrent jobs" number comes from; it makes no sense to me.  I suppose it's possible that Hadoop's default scheduler prioritizes jobs differently than a traditional HPC scheduler would, but that's easily rectified through some modifications to the priority-queue algorithm.
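In fact, swapping in a different scheduling policy doesn't even require code changes.  For example, YARN ships with a Fair Scheduler that shares cluster capacity among many concurrent jobs instead of strictly queueing them, and selecting it is a single property in yarn-site.xml (this fragment is a sketch; tune the scheduler's own allocation file to taste):

```xml
<!-- yarn-site.xml: use the Fair Scheduler instead of the default -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
```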

I can believe that performance may suffer as you add more and more users.  After all, HDFS daemons sit on each node and may get busier and busier as you add users.  However, I could make the same argument about traditional HPC file systems: the more users you add to them, the busier the file system gets.  At the end of the day, you can only pump so much data through a network link.
Lockwood maintains that Hadoop does not support scalable network topologies like multidimensional meshes.
While technically true, Big Data applications are programmed and designed in a completely different way, and they may not necessarily benefit from such advanced network topologies.  It's possible Lockwood has some specific applications in mind that could benefit, but for the general class of problems being handled, I would disagree with this statement.
Add to that, the Hadoop Distributed File System (HDFS) “is very slow and very obtuse” when compared with common HPC parallel file systems like Lustre and the General Parallel File System.
Now this comment I'm going to take a little more time to discuss.  Reiterating some of my points from my earlier "HDFS vs. Lustre" post, this is comparing apples to oranges.

The real comparison is "MapReduce over HDFS/local disks vs. MapReduce over Lustre."

MapReduce creates many small files during its shuffle phase.  Does Lustre/GPFS perform well with small files compared to local disk?

MapReduce performs many random-like seeks/reads during its shuffle phase.  Does Lustre/GPFS perform well with random reads compared to local disk?

When your data problem exceeds system memory and you need to spill contents to disk temporarily, will temporary scratch spills be faster to local disk or a networked file system?
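These questions are easy to probe empirically.  Here is a hedged sketch of the kind of microbenchmark I have in mind: write many small, spill-sized files to a directory and time it.  Point it at a local-disk path and at a parallel-file-system mount and compare; the file count and size are arbitrary, and here it writes to a temporary local directory just so the sketch runs anywhere:

```python
import os
import tempfile
import time

def write_small_files(directory, n_files=200, size=64 * 1024):
    """Write n_files files of `size` bytes each into `directory` and
    return the elapsed seconds.  This mimics the many small
    intermediate files MapReduce emits during its shuffle phase."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(n_files):
        with open(os.path.join(directory, f"spill_{i}.tmp"), "wb") as f:
            f.write(payload)
    return time.perf_counter() - start

# Substitute a local-disk path and a Lustre/GPFS mount point here to
# make the comparison; this demo just uses a local temporary directory.
with tempfile.TemporaryDirectory() as local_dir:
    elapsed = write_small_files(local_dir)
    print(f"wrote 200 small files in {elapsed:.3f}s")
```

Metadata operations like file creates are exactly where a networked file system pays round-trip costs that a local disk does not, which is the heart of the argument.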

I could go on and on and on with this argument.

The point is: is HDFS less flexible than Lustre or GPFS?  Yes.  But does it serve its purpose better than Lustre/GPFS?  I think it does.

Hopefully in the near future I will be able to point to online published results illustrating this fact.