"scale" entries

Four short links: 18 May 2010

Multitouch Medical Errors, Scaling, Javascript Charts, and Fighting Credit Crunches with Open Data

  1. Tondo Interactive Table to Analyze Medical Errors (MedGadget) — use of a multitouch table to help clinical staff identify and track medical errors. (via IVLINE on Twitter)
  2. Steve Huffman Lessons Learned While at Reddit (SlideShare) — uptime and scale. It’s interesting that most everyone reinvents tuples as a way to scale databases, hence the popularity of NoSQL systems.
  3. HumbleFinance — JavaScript library to render dynamic charts as per Google Finance. (via carlos_d_hoy on Delicious)
  4. Hernando de Soto: Shadow Economies — de Soto is an economist, and this ends up talking about the need for transparency and open data. As long as you don’t know who owns the greatest amount of your assets, there’s no info as to who owns what, who is related to what, you have a shadow economy. We live in one, and it has as a characteristic a permanent credit crunch. We know more about it than you do. Credit crunch is where you don’t know who you’d be lending to, so you don’t lend. It’s permanent, we live with it, and now you’re going to have to learn to live with it too, because until you know who is solvent how can you give anybody credit? You’re flying blind. (via Jon Udell)
Four short links: 28 April 2010

Fair Use Economy, Deconstituted Appliances, 3D Vision, Redis for Fun and Profit

  1. Fair Use in the US Economy (PDF) — prepared by IT lobby in the US, it’s the counterpart to Big ©’s fictitious billions of dollars of losses due to file sharing. Take each with a grain of salt, but this is interesting because it talks about the industries and businesses that the fair use laws make possible.
  2. Disassembled Household Appliances — neat photos of the pieces in common equipment like waffle irons, sandwich makers, can openers, etc. (via evilmadscientist)
  3. GelSight — gel block on a sheet of glass, lit from below with lights and then scanned with cameras, lets you easily capture 3D qualities of the objects pressed into it. Very cool demo–you can see finger prints, pulse, and even make out designs on a $100 bill.
  4. Redis Tutorial (Simon Willison) — Redis is a very fast collection of useful behaviours wrapped around a distributed key-value store. You get locks, IDs, counters, sets, lists, queues, replication, and more.
Four short links: 9 February 2010

Government Dashboard, Science Code Errors, Scaling Online Games, Information Theory

  1. Track DC — informative drill-down report from Washington DC government about the different departments. (via Sunlight Labs blog)
  2. Errors in Scientific Software — a 1994 study of scientific software that found inconsistent interfaces (1 in 7 for Fortran, 1 in 37 for C) and poor use of arithmetic such that significant figures declined from 6sf in the data to 1sf in the result. (via “If you’re going to do good science, release the computer code too” in the Guardian)
  3. How Farmville Scales — 75M players/month (28M/day), 1/4 of disk activity is writes, 50% higher load spikes, 3G/s of traffic goes between Farmville and Facebook at peak, LAMP stack, nagios+munin+puppet. (via Hacker News)
  4. Mathematical Philology — when two manuscripts of the same text differ, which is correct? This PLoSONE paper looked at all such discrepancies in Lucretius’s De Rerum Natura and found that the traditional principle of choosing the more difficult reading (on the grounds that errors are from humans unconsciously simplifying) has a strong information theory justification for it. Interesting to see this less than a week after an MIT Technology Review article on quantum teleportation remarked, There is a growing sense that the properties of the universe are best described not by the laws that govern matter but by the laws that govern information.
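
The loss of significant figures described in the Errors in Scientific Software item can be reproduced with one subtraction. This is a minimal sketch, not taken from the study: the two measurements and the helper `significant_figures` are hypothetical, chosen only to show how subtracting nearly equal values cancels the leading digits.

```python
from decimal import Decimal


def significant_figures(value: Decimal) -> int:
    """Count the digits stored in a Decimal (leading zeros are not stored)."""
    return len(value.as_tuple().digits)


# Two hypothetical measurements, each recorded to 6 significant figures.
a = Decimal("1.23456")
b = Decimal("1.23455")

# The five leading digits agree, so they cancel in the subtraction.
difference = a - b

print(significant_figures(a), significant_figures(b))  # 6 6
print(difference, significant_figures(difference))     # 0.00001 1
```

Every later calculation that uses `difference` inherits its single significant figure, however precise the inputs were.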
Four short links: 4 January 2010

Code for Speed, Wooden Locks, Font Design, and a Java Distributed Data Store

  1. Why Git Is So Fast — interesting mailing list post about the problems that the JGit folks had when they tried to make their Java version of Git go faster. Higher level languages hide enough of the machine that we can’t make all of these optimizations. A reminder that you must know and control the systems you’re running on if you want to get great performance. (via Hacker News)
  2. Wooden Combination Lock — you’ll easily understand how combination locks work with this fine piece of crafty construction work.
  3. From Moleskine to Market — how a leading font designer designs fonts. Fascinating, and beautiful, and it makes me covet his skills.
  4. Terrastore — open source distributed document store, HTTP accessible, data and queries are distributed, built on Terracotta which is built on ehcache (updated: Terracotta has an ehcache plugin, but isn’t built on ehcache). A NoSQL database built on Java tools that serious Java developers respect, the first such one that I’ve noticed (update: I brain-farted: neo4j was definitely on my radar). Notice that all the interesting work going on in the NoSQL arena is happening in open source projects.
Four short links: 22 December 2009

Trading Systems, Streaming iTunes, Scheduling App, Crowdsourcing Lessons

  1. Trading Shares in Milliseconds (Technology Review) — With the rise of automation, the bulk of U.S. stock trading has moved from the once-crowded floor of Manhattan’s New York Stock Exchange (NYSE) to silent server farms run by exchanges and broker-dealers across the country: the proportion of all trades that the NYSE handles has shrunk from 80 percent in 2005 to 40 percent today. Trading is now essentially a virtual art, and its practitioners put such a premium on speed that NASDAQ has considered issuing equal 100-foot lengths of cable to the brokers who send orders to its exchange servers. (via Hacker News)
  2. Stream iTunes Over SSH — short script that lets you tunnel iTunes from one machine to another over SSH (by default iTunes only shares on the local network).
  3. Doodle — simple way to schedule a common meeting time. (via joshua on Delicious)
  4. Crowdsourcing — Simon Willison’s thoughtful “lessons learned” from his crowdsourcing projects at the Guardian. Crowdsourcing is not as simple as “give them a wiki and they will fill it” (this is related to the failed “everyone in the world wants to work on my broken payroll system” theory of open source), and Simon explains some of the subtleties. The reviewing experience the first time round was actually quite lonely. We deliberately avoided showing people how others had marked each page because we didn’t want to bias the results. Unfortunately this meant the site felt like a bit of a ghost town, even when hundreds of other people were actively reviewing things at the same time. For the new version, we tried to provide a much better feeling of activity around the site. We added “top reviewer” tables to every assignment, MP and political party as well as a “most active reviewers in the past 48 hours” table on the homepage (this feature was added to the first project several days too late). User profile pages got a lot more attention, with more of a feel that users were collecting their favourite pages in to tag buckets within their profile.
Four short links: 22 October 2009

Cognitive Surplus, Scaling, Chinese Blogs, CS Education for Growth

  1. Eight Billion Minutes Spent on Facebook Daily — you weren’t using that cognitive surplus, were you?
  2. How We Made Github Fast — high-level summary is that the new “fast, good, cheap–pick any two” is “fast, new, easy–pick any two”. (via Simon Willison)
  3. Isaac Mao, China, 40M Blogs and Counting — Today, there are 40 million bloggers in China and around 200 million blogs, according to Mao. Some blogs survive only a few days before being shut down by authorities. More than 80% of people in China don’t know that the internet is censored in their country. When riots broke out in Xinjiang province this year, the authorities shut down internet access for the whole region. No one could get online.
  4. Congress Endorses CS Education as Driver of Economic Growth — compare to Economist’s Optimism that tech firms will help kick-start economic recovery is overdone.
Four short links: 10 July 2009

Network File System, Internet Use, Lovelace Comic, Search User Interfaces

  1. Ceph — open source distributed filesystem from UCSC. Ceph is built from the ground up to seamlessly and gracefully scale from gigabytes to petabytes and beyond. Scalability is considered in terms of workload as well as total storage. Ceph is designed to handle workloads in which tens of thousands of clients or more simultaneously access the same file, or write to the same directory–usage scenarios that bring typical enterprise storage systems to their knees. (via joshua on Delicious)
  2. Daily Internet Activities, 2000-2009 — Pew Charitable Trust’s Internet usage survey. We’ve finally broken 50% of Americans using the Internet daily. Twitter is almost a rounding error. (via dhowell on Twitter)
  3. The Thrilling Adventures of Lovelace and Babbage — fantastic comic, with end-notes that explain how Babbage and Lovelace’s lives and works are reflected in the action of the comic. (via suw on Twitter)
  4. Search User Interfaces — full text of this book about the different (successful and un-) interfaces to search. (via sebchan on Twitter)

Announcing: Spike Night at Velocity

Guest blogger Scott Ruthfield is a Program Committee member of the O’Reilly Velocity: Web Performance & Operations Conference. Web Operations is not for the casual observer: it’s for a particular kind of adrenaline junkie that’s motivated by graphs and servers spinning out of control. Jumping in, on-your-feet analysis, and experience-based-experimentation are all part of solving new problems caused by unexpected user and machine behavior,…

Four short links: 8 June 2009

3D Geometry, The Printable Web, Government Internet Fail, and Real World Cloud Computing

  1. How to Project on 3D Geometry — the fine art (and math) of distorting an image so that it looks undistorted when projected onto a non-flat 3D surface. Confused? See the images below. (via straup on Delicious)
  2. ZinePal — Create your own printable magazine from any online content. (via warrenellis on Delicious)
  3. What The Government Doesn’t Understand About The Internet And What To Do About It — Tom Steinberg from MySociety lays it out. As true for US, NZ, and every other country as it is for the UK (for which it was written). Accept that any state institution that says “we control all the information about X” is going to look increasingly strange and frustrating to a public that’s used to being able to do whatever they want with information about themselves, or about anything they care about (both private and public). This means accepting that federated identity systems are coming and will probably be more successful than even official ID card systems: ditto citizen-held medical records. It means saying “We understand that letting train companies control who can interface with their ticketing systems means that the UK has awful train ticket websites that don’t work as hard as they should to help citizens buy cheaper tickets more easily. And we will change that, now.” What I like about Tom vs the US’s Gov 2.0 is that Tom puts down philosophy that’s hard to argue with, whereas the US is dangerously close to simply focusing on techniques and that’s subvertible.
  4. Real World Cloud Computing — summary from a panel of startups who are using EC2. The lock-in is latency. Transferring data within the Amazon services is free. Transferring data to an Amazon competitor: not free.

Sample distorted and undistorted images

Velocity 2009 – Big Ideas (early registration deadline)

(tag cloud created from Velocity session & speaker information using wordle.net) My favorite interview question to ask candidates is: “What happens when you type www.(amazon|google|yahoo).com in your browser and press return?” While the actual process of serving and rendering a page takes seconds to complete, describing it in real detail can take an hour. A good answer spans every part…