- Seagate Kinetic Storage — In the words of Geoff Arnold: The physical interconnect to the disk drive is now Ethernet. The interface is a simple key-value object oriented access scheme, implemented using Google Protocol Buffers. It supports key-based CRUD (create, read, update and delete); it also implements third-party transfers (“transfer the objects with keys X, Y and Z to the drive with IP address 126.96.36.199”). Configuration is based on DHCP, and everything can be authenticated and encrypted. The system supports a variety of key schemas to make it easy for various storage services to shard the data across multiple drives.
- Masters of Their Universe (Guardian) — well-written and fascinating story of the creation of the Elite game (one founder of which went on to make the Raspberry Pi). The classic action game of the early 1980s – Defender, Pac Man – was set in a perpetual present tense, a sort of arcade Eden in which there were always enemies to zap or gobble, but nothing ever changed apart from the score. By letting the player tool up with better guns, Bell and Braben were introducing a whole new dimension, the dimension of time.
- Micropolar (github) — A tiny polar charts library made with D3.js.
- Introduction to R (YouTube) — 21 short videos from Google.
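The Kinetic item above describes key-based CRUD plus third-party transfers between drives. As a rough illustration only, here is a toy in-memory model of that idea. `ToyDrive`, its method names, and the `drive-a`/`drive-b` labels are all hypothetical; this is not the Kinetic API, and it involves no Ethernet, Protocol Buffers, authentication, or encryption.

```python
class ToyDrive:
    """Hypothetical in-memory stand-in for a key-value drive (not the Kinetic API)."""

    def __init__(self, address):
        self.address = address  # illustrative label only; no networking happens
        self._objects = {}      # key (bytes) -> value (bytes)

    def put(self, key, value):
        """Create or update the object stored under key."""
        self._objects[key] = value

    def get(self, key):
        """Read the object stored under key, or None if it is absent."""
        return self._objects.get(key)

    def delete(self, key):
        """Delete the object; return True if something was removed."""
        return self._objects.pop(key, None) is not None

    def transfer(self, keys, target):
        """Copy the named objects straight to another drive; return the keys copied."""
        copied = []
        for key in keys:
            if key in self._objects:
                target.put(key, self._objects[key])
                copied.append(key)
        return copied


if __name__ == "__main__":
    a = ToyDrive("drive-a")
    b = ToyDrive("drive-b")
    a.put(b"X", b"first object")
    a.put(b"Y", b"second object")
    # "Transfer the objects with keys X, Y and Z to the other drive":
    # Z does not exist on drive a, so it is skipped.
    print(a.transfer([b"X", b"Y", b"Z"], b))
    print(b.get(b"X"))
```

The point of the sketch is that the caller asks one drive to hand objects to another, so the data never has to pass through the client.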
Disk Over Ethernet, Inside Elite, Polar Charts, and R Videos
In-memory data management brings data close to the computation.
We wanted to give you a brief update on what we’ve learned so far from our series of interviews with players and practitioners in the in-memory data management space. A few preliminary themes have emerged, some expected, others surprising.
Performance improves as you put data as close to the computation as possible. We talked to people in systems, data management, web applications, and scientific computing who have embraced this concept. Some solutions go to the lowest level of hardware (L1, L2 cache). The next generation of SSDs will have latency closer to that of main memory, potentially blurring the distinction between storage and memory. For performance and power-consumption reasons, we can imagine a future in which systems are sized primarily by the amount of non-volatile memory deployed.
Putting data in-memory does not negate the importance of distributed computing environments. Data size and the ability to leverage parallel environments are frequently cited reasons. The same characteristics that make the distributed environments compelling also apply to in-memory systems: fault-tolerance and parallelism for performance. An additional consideration is the ability to gracefully spill over to disk when main memory is full.
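To make the last point concrete, here is a minimal sketch of a store that keeps a bounded number of entries in memory and moves the least recently used ones to disk once that bound is reached. The class name `SpillStore`, the entry-count limit, and the pickle-file layout are assumptions made for illustration; real in-memory systems typically budget by bytes and are far more careful about concurrency and durability.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class SpillStore:
    """Hypothetical key-value store that spills least recently used entries to disk."""

    def __init__(self, max_in_memory, spill_dir=None):
        self.max_in_memory = max_in_memory
        self.spill_dir = spill_dir or tempfile.mkdtemp(prefix="spill-")
        self._hot = OrderedDict()  # entries held in memory, oldest first
        self._spilled = {}         # key -> path of the pickled value on disk
        self._counter = 0          # used to build unique file names

    def put(self, key, value):
        self._drop_spilled(key)    # discard any stale on-disk copy
        self._hot[key] = value
        self._hot.move_to_end(key)
        # Over budget: push the least recently used entries out to disk.
        while len(self._hot) > self.max_in_memory:
            old_key, old_value = self._hot.popitem(last=False)
            self._spill(old_key, old_value)

    def get(self, key):
        if key in self._hot:
            self._hot.move_to_end(key)
            return self._hot[key]
        path = self._spilled.get(key)
        if path is None:
            raise KeyError(key)
        with open(path, "rb") as f:
            value = pickle.load(f)
        self.put(key, value)       # promote back into memory
        return value

    def _spill(self, key, value):
        path = os.path.join(self.spill_dir, f"{self._counter}.pkl")
        self._counter += 1
        with open(path, "wb") as f:
            pickle.dump(value, f)
        self._spilled[key] = path

    def _drop_spilled(self, key):
        path = self._spilled.pop(key, None)
        if path is not None:
            os.remove(path)


if __name__ == "__main__":
    store = SpillStore(max_in_memory=2)
    for name, value in [("a", 1), ("b", 2), ("c", 3)]:
        store.put(name, value)
    # "a" was the least recently used entry, so it now lives on disk,
    # but it is still readable through the same interface.
    print(store.get("a"))
```

Callers see one interface whether a value is in memory or on disk; only the latency differs, which is the sense in which the spill is graceful.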
Storage architectures show simplicity's power and how to build clouds at scale.
Simple systems scale effectively, while complex systems struggle to overcome the multiplicative effect of potential failure points. This shows us why the most reliable and scalable clouds are those made up of fewer, simpler parts.
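One simple way to see the multiplicative effect is to assume, purely for illustration, that a system works only if every one of its parts works and that parts fail independently. Under those assumptions (which the excerpt itself does not state), overall availability is the product of the per-part availabilities, so it drops quickly as parts are added. The function name and the 0.99 figure are hypothetical.

```python
import math


def series_availability(component_availabilities):
    """Availability of a system that needs every component up.

    Assumes independent failures, which real systems only approximate.
    """
    return math.prod(component_availabilities)


if __name__ == "__main__":
    # Each part is assumed to be available 99% of the time.
    for part_count in (1, 10, 100):
        availability = series_availability([0.99] * part_count)
        print(f"{part_count:>3} parts -> {availability:.3f}")
```

With these assumed numbers, ten parts give an availability of roughly 0.90 and a hundred parts roughly 0.37, even though each part on its own looks reliable.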
- Terrier IR — open source (Mozilla) text search engine, now with Hadoop support.
- s3ql — open source (GPLv3) Linux filesystem which stores its data on Google Storage, Amazon S3, or OpenStack. (via Adam Shand)
- Julie Learns to Program — blog from our own Julie Steele as she learns her first programming language. The point is: it’s in me. I wasn’t sure that it was, and now I know—it is. And what, exactly, is “it”? It is the bug. It is the combination of native curiosity and stubbornness that made me play around with the code and take some wild guesses instead of running straight to Google (or choosing to stay within the bounds of the exercise). That might sound like a small thing, but I know it is not. I was determined to make the program do what I wanted it to do, I came up with a few guesses as to how to do that, and I kept trying different things until I succeeded (and then I felt thrilled). As much as I have to learn, I know now that I really am hooked. And that I’ll get there.
- WWW::Mechanize::Firefox — Perl module to control Firefox, using the same interface as the WWW::Mechanize web robot module. (via straup on Delicious)
- Anatomy of SSDs — teeth-rattlingly technical Linux Magazine article explaining the different types of SSDs (Solid State Disks–imagine a hard drive made of rapid-access Flash memory). Artur Bergman told me that installing an SSD drive in his MacBook Pro gave the greatest performance increase of any computer upgrade he’d performed since he went from no computer to one.
NASA Cloudware, btrfs, eBook Editing, Exponential Death
- NASA Nebula Services/Platform Stack — The NEBULA platform offers a turnkey Software-as-a-Service experience that can rapidly address the requirements of a large number of projects. However, each component of the NEBULA platform is also available individually; thus, NEBULA can also serve in Platform-as-a-Service or Infrastructure-as-a-Service capacities. Bundles RabbitMQ, Eucalyptus, LUSTRE storage, Fabric deployment, Varnish front-end, MySQL and more. (via Jim Stogdill)
- A Short History of btrfs — Now for some personal predictions (based purely on public information – I don’t have any insider knowledge). Btrfs will be the default file system on Linux within two years. Btrfs as a project won’t (and can’t, at this point) be canceled by Oracle. If all the intellectual property issues are worked out (a big if), ZFS will be ported to Linux, but it will have less than a few percent of the installed base of btrfs. Check back in two years and see if I got any of these predictions right!
- Sigil — open source WYSIWYG eBook editor. (via liza on Twitter)
- Exponential Decay of Life — This startling fact was first noticed by the British actuary Benjamin Gompertz in 1825 and is now called the “Gompertz Law of human mortality.” Your probability of dying during a given year doubles every 8 years. For me, a 25-year-old American, the probability of dying during the next year is a fairly minuscule 0.03% — about 1 in 3,000. When I’m 33 it will be about 1 in 1,500, when I’m 42 it will be about 1 in 750, and so on. (via Hacker News)
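The doubling rule in the last item can be written as a one-line formula. The sketch below takes the quoted starting point (about 1 in 3,000 at age 25) and the 8-year doubling period as its only inputs; the function name and the cap at 1.0 are assumptions added here, and the quoted figures are rounded, so the output will not match them exactly at every age.

```python
def yearly_death_probability(age, base_age=25, base_probability=1 / 3000,
                             doubling_years=8):
    """Naive Gompertz-style estimate: the yearly risk doubles every doubling_years.

    The result is capped at 1.0 because the raw formula grows without bound.
    """
    raw = base_probability * 2 ** ((age - base_age) / doubling_years)
    return min(1.0, raw)


if __name__ == "__main__":
    for age in (25, 33, 42, 60, 80):
        p = yearly_death_probability(age)
        print(f"age {age}: about 1 in {round(1 / p):,}")
```

Eight years after the base age the estimate is exactly double the starting value, which reproduces the quoted 1 in 1,500 at age 33.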