Shared nothing parallel programming

I agree strongly with Tim and Nathan’s belief in the importance of parallel computing. I’ve been following this space since 2000, when I took Gurusamy Sarathy’s initial work on making perl multi-threaded and finished it for the 5.8 release.

The initial perl threading released in 5.5 had a traditional architecture: all data was shared between all threads. The problem with this approach was that the need for continuous synchronization between threads slowed the whole machine down. For 5.8 we revised the plan and settled on a completely non-shared environment as the default. Each thread had its own context, with its own data space. Only explicitly shared variables were accessible between threads. This let most of the code run at full speed, paying the synchronization cost only when a shared variable was accessed.
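To make the "private by default, shared only on request" model concrete, here is a minimal sketch in Python rather than Perl. It is a loose analogy, not the perl threads API: `SharedCounter`, `worker` and `run` are hypothetical names, and `threading.local()` stands in for the per-thread data space. Each thread does all of its counting in private state and takes the lock only once, when it touches the one explicitly shared variable.

```python
import threading


class SharedCounter:
    """The one explicitly shared variable; every access takes the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0
        self.acquisitions = 0  # how many times the lock was taken

    def add(self, amount):
        with self._lock:
            self.value += amount
            self.acquisitions += 1


def worker(shared, private, n_items):
    # Private phase: `private` is thread-local, so no locking is needed.
    private.count = 0
    for _ in range(n_items):
        private.count += 1
    # Shared phase: pay the synchronization cost once, at the end.
    shared.add(private.count)


def run(num_threads=4, n_items=10_000):
    shared = SharedCounter()
    private = threading.local()  # each thread sees its own attributes
    threads = [
        threading.Thread(target=worker, args=(shared, private, n_items))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.value, shared.acquisitions
```

Calling `run(4, 1000)` returns the total count together with the number of lock acquisitions, which equals the number of threads and not the number of items. That difference is the point of the design: the synchronization cost scales with how often shared data is touched, not with how much work is done.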

I am a firm believer in the shared nothing architecture. Multithreading is hard, and the standard way to solve concurrency problems is to add mutex protection around the non-thread-safe code. A mutex allows only one thread to access a particular resource at a time. So imagine your 32-core machine, running an application with 32 threads that uses a mutex to control access to a vital part of the application. All threads need to continuously acquire this mutex, creating a bottleneck that allows only a few threads to make progress at any moment. So your 32 threads, on your 32-core machine, are mostly sitting around waiting for their turn.

With a shared nothing architecture, you can avoid this. If your thread never has to acquire a mutex, it can run at full speed on its assigned CPU. A recent visit to IBM Almaden again underscored the importance of this to me. They showed us a Blue Gene, an awesome beast with 2048 CPUs per rack. Each CPU is a little computer on a chip, with ethernet networking, local interconnects and 512 MB of RAM. They have two of these racks together, and to make it even cooler, you can put 64 of these together for a total of 65536 CPUs. All of these CPUs share no memory, so to implement software on them, you have to use a shared nothing architecture.
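A shared nothing design of this kind can be sketched as a set of "nodes" that each own their data outright and communicate only by messages. The Python below is a minimal illustration under stated assumptions: `node` and `sum_of_squares` are hypothetical names, threads stand in for separate CPUs, and `queue.Queue` stands in for the interconnect. Python threads can share memory, so the isolation here is a discipline the code follows, not something the runtime enforces.

```python
import queue
import threading


def node(inbox, outbox):
    """One 'node': owns its data outright and talks only via messages."""
    chunk = inbox.get()                      # receive a private copy of the work
    outbox.put(sum(x * x for x in chunk))    # send back exactly one result


def sum_of_squares(values, num_nodes=4):
    outbox = queue.Queue()
    threads = []
    for i in range(num_nodes):
        inbox = queue.Queue()
        # list(...) copies the slice so no two nodes hold the same object.
        inbox.put(list(values[i::num_nodes]))
        t = threading.Thread(target=node, args=(inbox, outbox))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    # The coordinator merges one message per node; arrival order is irrelevant.
    return sum(outbox.get() for _ in range(num_nodes))
```

The queues do synchronize internally, but each node touches one only twice, once to receive its work and once to report. On CPython the global interpreter lock keeps these threads from running Python code in parallel, so this shows the structure only; the same shape carries over to separate processes or machines, where the nodes really do share no memory.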

The important challenge is not to allow star developers to write multithreaded code; it is to allow the large army of enterprise developers out there to scale their applications to large numbers of cores. Perhaps tools like PeakStream (purchased by Google) or its remaining competitor, RapidMind, can help, but I remain doubtful. I spent a summer reading a printout of all 16,000 lines of perl regular expression code, with a marker pen to find problematic spots. I am unconvinced a tool could have done that for me.

Radar friend Jeff Jonas made me think about this when he posted about performance on his blog. I believe this is the direction parallel computing has to go.

Our small database footprint project had the goal of externalizing as much computation as possible off the database engine – pushing this processing into shared nothing parallelizable pipelines. So we also did such things as externalized serialization (no more using the database engine to dole out unique record IDs) and eliminated virtually all stored procedures and triggers – placed more computational weight on these “n” wide pipeline processes instead.
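The quote does not say how the unique record IDs were handed out once the database stopped doing it. One common way to do this without any shared counter, shown here purely as an assumption and not as what that project did, is to give each of the "n" pipelines its own interleaved slice of the ID space. The names `id_stream` and `take` are hypothetical.

```python
import itertools


def id_stream(pipeline_index, num_pipelines):
    """Yield IDs unique across pipelines with no shared counter.

    Pipeline k of n emits k, k + n, k + 2n, ... so the streams never overlap.
    """
    if not 0 <= pipeline_index < num_pipelines:
        raise ValueError("pipeline_index must be in [0, num_pipelines)")
    return itertools.count(start=pipeline_index, step=num_pipelines)


def take(stream, n):
    """Pull the next n IDs from a stream."""
    return list(itertools.islice(stream, n))
```

For example, `take(id_stream(1, 4), 3)` gives `[1, 5, 9]`. No pipeline ever needs to ask another pipeline, or the database, what the next free ID is. The trade-off is that the number of pipelines is baked into the scheme, so changing it later needs care.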


  • Bill R

    I agree that parallel computing is the way of the future – but it seems to be a fundamentally hard problem to design “normal” computing tasks in a parallelizable way. I dabbled a bit during the late 80s and early 90s with parallel computing and we don’t seem to have come very far since then.

    At the time, it was best suited to naturally parallel algorithms like solving partial differential equations for physics applications and some search algorithms. The holy grail was a compiler that would just take your program and make it parallel for you. We didn’t have that then and it seems still a long way off.

  • Anonymous for shared nothing programming

  • There are several alternatives emerging that didn’t exist before or were not economically viable because of performance constraints. Message-passing systems have been around for a while, but Erlang has done a great job of making them more accessible, while pi-calculus has given them a theory to build on. However, these are not the only options. Software Transactional Memory (STM) is another great contender to look at. It hasn’t yet found its way into the mainstream, but there is a lot of interesting research going on around it.

    I’m a big believer in the REST design pattern, which already builds on message passing and promotes a document-oriented programming model. Coupling it with STM to handle atomic, concurrent state transitions, plus a dose of compensating transactions for other operations, should go a long way toward solving the problem. The nice thing is that all of this can be packaged into a neat programming model that will make threaded programming look like the dark ages of assembly.

  • Shared nothing parallel programming is not just the wave of the future. It’s happening now at large Wall Street firms for their most mission-critical applications (trading) using approaches such as Space-Based Architecture from GigaSpaces.

  • I think you should be careful when saying, on a site as influential as this one, that any single concurrency model is clearly superior. Shared nothing is conceptually the easiest, but it is not always (or maybe even generally) easy to cleave a problem space into entirely independent sub-domains; a common design result is either very coarse-grained subdomains that don’t actually exploit the parallelism potential or very fine-grained subdomains in which people re-invent synchronization primitives (often badly).

    I think we’re _far_ from knowing how the manycore revolution is going to play out in the mainstream. The experiences of academic and scientific programmers will certainly be valuable, but they are by no means necessarily the heralds of the mainstream future.

  • nadim khemir

    “Share nothing” is a paradigm which can certainly be applied successfully, but let’s not forget that many, maybe most, useful applications are written within the “share all” paradigm.

    Unfortunately, the implementation of share nothing threads in Perl is very limiting (memory usage, and an unacceptable performance hit which makes un-parallelized apps run faster than parallelized apps). This has led only to many workarounds and generally less interest in parallelizing applications.

    I believe the prime reason was to make it possible to run existing perl modules in a multi-threaded environment without changes. IMO, it would have been better to mark the modules as “not thread safe” and let the programmers deal with that.

    Perl module authors are generally very responsive and would, I like to believe, have made their modules thread safe.

    Making multi-threading more accessible for most is an excellent idea, but I believe the star developers’ needs should be handled too.
    It would be great, in the context of Perl 6, to see the old, although difficult, “share all/watch your back” model back.