Supercomputing challenge at historic conference

SC07, this year's edition of the international conference on high performance computing, networking, storage, and analysis, continues a series that has been held every year since the late 1980s. Now, of course, a dexterous programmer can put together a system in a garage with the power of the largest cluster in the TOP500 list of ten years ago. And six teams of student programmers will do something quite like that this year in the first annual cluster challenge.

I spoke to Brent Gorda, high performance computing architect at Livermore Labs, who is coordinating the cluster challenge. SC07 sounds like the Burning Man of high-performance computing: a silicon-rich environment that attracts ten to twelve thousand attendees each year exercising equal doses of computing brawn and brain, with vendors of hardware and software vying for space on the huge exhibition floor. Everybody from financial analysts to climate change modelers knows they need clusters. Performance gains will come increasingly from parallelism, both at the multicore level and the cluster level, rather than from faster processors.

The cluster challenge unites each team of students (no one with a
degree can be a team member) with supercomputer vendors to assemble
whatever combination of hardware and software can run off a 30-amp
circuit. A few universities have held classes in the technologies
involved, but during the 48-hour, round-the-clock challenge, students
are not allowed to get outside help.

The benchmarks will be familiar to the students:

What the students won’t know is the data to be fed into these programs
during the challenge. The team that processes the most data (with
accurate results, of course) will be the winner.

Some experts, upon hearing of the challenge, call it trivial, while others counsel against participation because it's impossibly hard. In any case, employers are already scouting the student participants for jobs.

Although the challenge is weighted toward scientific applications, Gorda has seen clusters put to plenty of use in other areas. Manufacturers, for instance, run clusters to simulate their production processes and the likely outcomes.

Linux is growing as a favored operating system for clusters, although
any operating system is permissible in the competition. (Apple is
participating, although Microsoft is not.)

Multiprocessing software is stuck in what Gorda believes is an
old-fashioned paradigm. Although multiple libraries and platforms
compete for the programmer’s favor (here I can plug one that we liked
enough to publish a book on,
Intel Threading Building Blocks: Outfitting C++ for Multi-core Processor Parallelism),
only MPI is really popular. And it's designed for the millisecond-speed data exchanges of a LAN, not the nanosecond-speed exchanges among the cores of a single computer.
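
To make the contrast concrete, here is a minimal sketch, my own rather than anything from the book or the challenge, of the shared-memory style a library like Threading Building Blocks encourages: every core works directly on the same array, whereas an MPI program would give each process its own slice and exchange explicit messages.

    #include <tbb/parallel_for.h>
    #include <tbb/blocked_range.h>
    #include <vector>
    #include <cstddef>

    // Body object that doubles every element in the sub-range handed to it.
    struct Doubler {
        float* data;
        void operator()(const tbb::blocked_range<std::size_t>& r) const {
            for (std::size_t i = r.begin(); i != r.end(); ++i)
                data[i] *= 2.0f;
        }
    };

    int main() {
        std::vector<float> v(1000000, 1.0f);  // one array shared by all cores
        Doubler body = { &v[0] };
        // TBB splits the index range across the available cores; the threads
        // share one address space, so no messages are sent at all.
        tbb::parallel_for(tbb::blocked_range<std::size_t>(0, v.size()), body);
        return 0;
    }

An MPI version of the same loop would hand each process its own slice of the vector and shuttle results around with calls such as MPI_Send and MPI_Recv, the message-passing model Gorda describes as built for LAN-speed exchanges.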

Scientific applications that are already parallelized also have to be redesigned. Most break their data sets up into chunks of roughly a gigabyte, one per process, which fits nicely into the memory of a machine with a small number of processors. But put 32 processors in one machine and it needs 32G of memory just to hold its share of the data, which is very expensive.
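
As a back-of-the-envelope illustration, here is a sketch of the kind of block decomposition such codes use; the gigabyte-per-process chunk size comes from the paragraph above, while everything else (the function name, the 32 GB data set) is made up for the example.

    #include <cstdio>
    #include <cstddef>

    // Block decomposition: hand each of nprocs processes a contiguous slice
    // of n elements, spreading any remainder over the first few ranks.
    void slice(std::size_t n, int nprocs, int rank,
               std::size_t* start, std::size_t* count) {
        std::size_t base  = n / nprocs;
        std::size_t extra = n % nprocs;
        std::size_t r     = static_cast<std::size_t>(rank);
        *count = base + (r < extra ? 1 : 0);
        *start = r * base + (r < extra ? r : extra);
    }

    int main() {
        // Illustrative figures only: a 32 GB data set of doubles split 32
        // ways, roughly the gigabyte-per-process layout described above.
        const std::size_t n = (32ull << 30) / sizeof(double);
        const int nprocs = 32;
        for (int rank = 0; rank < nprocs; ++rank) {
            std::size_t start, count;
            slice(n, nprocs, rank, &start, &count);
            std::printf("rank %2d: elements [%zu, %zu), about %.1f GB\n",
                        rank, start, start + count,
                        count * sizeof(double) / 1e9);
        }
        return 0;
    }

Each slice is modest on its own, but a node that hosts all 32 processes has to supply the full 32G.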

I asked Gorda’s opinion of grid computing over wide-area networks, and
he said such grids have floundered for two reasons: most application designers
expect the fast data transfer times of local networks and multiple
cores, and designing payment models for diverse hardware owners is too
hard. He expects companies to start offering cluster applications such
as the ones mentioned earlier on single-company services such as
Amazon’s EC2.

So those of us who can't make it to Reno, Nevada, on November 10-16 should tune in that week to hear about the student-built hardware that would have topped the TOP500 lists of the 1990s.