"nosql" entries

Four short links: 4 January 2010

Code for Speed, Wooden Locks, Font Design, and a Java Distributed Data Store

  1. Why Git Is So Fast — interesting mailing list post about the problems that the JGit folks had when they tried to make their Java version of Git go faster. Higher level languages hide enough of the machine that we can’t make all of these optimizations. A reminder that you must know and control the systems you’re running on if you want to get great performance. (via Hacker News)
  2. Wooden Combination Lock — you’ll easily understand how combination locks work with this fine piece of crafty construction work.
  3. From Moleskine to Market — how a leading font designer designs fonts. Fascinating, and beautiful, and it makes me covet his skills.
  4. Terrastore — open source distributed document store, HTTP accessible, data and queries are distributed, built on Terracotta which is built on ehcache (updated: Terracotta has an ehcache plugin, but isn’t built on ehcache). A NoSQL database built on Java tools that serious Java developers respect, the first such one that I’ve noticed (update: I brain-farted: neo4j was definitely on my radar). Notice that all the interesting work going on in the NoSQL arena is happening in open source projects.
Four short links: 11 December 2009

Real Time Text, NoSQL Reading List, New data.gov, and a Breakdancing Robot

  1. Real Time Text Taskforce — standardising live typing à la EtherPad and Google Wave, for accessibility reasons.
  2. NoSQL Required Reading — papers and presentations to get up to speed in the theory and practice of scalable key-value data stores. (via Hacker News)
  3. It’s Official, data.gov 2.0 is Coming — pointer to the design and philosophy document for the next iteration of data.gov. Interesting to see so much activity on US open government happening now: open government directive and progress report were released, along with a request for ideas on open access to publicly-funded science research.
  4. Breakdancing Robot — we live in the future, and it is good. (via @hollowaynz)

Turning Predictions into Opportunities

The view from the eye of a recession isn't great. When companies are going bust, unemployment growing, and everyone's scouring their budgets for costs to cut, it can be hard to see opportunities. However, when Tim pointed to Stephen O'Grady's fine set of 2010 predictions I found myself popping with "oh, so naturally this will happen next …" thoughts. Think…

Four short links: 13 November 2009

Open Source Design, Interesting NoSQL Use, Copyright Documentary, Location Intelligence

  1. Open Source Enters The World of Atoms — an academic statistical analysis of open design. We indicated that, in open design communities, tangible objects can be developed in very similar fashion to software; one could even say that people treat a design as source code to a physical object and change the object via changing the source.
  2. Why I Like Redis (Simon Willison) — coherent explanation of why Simon likes and uses a particular nosql system. I can run a long running batch job in one Python interpreter (say loading a few million lines of CSV in to a Redis key/value lookup table) and run another interpreter to play with the data that’s already been collected, even as the first process is streaming data in. I can quit and restart my interpreters without losing any data. And because Redis semantics map closely to Python native data types, I don’t have to think for more than a few seconds about how I’m going to represent my data.
  3. © kiwiright (Vimeo) — short documentary about copyright, made to raise awareness of the issues in New Zealand. (just as applicable to the rest of the world)
  4. Your Movements Speak For Themselves (Jeff Jonas) — Mobile devices in America are generating something like 600 billion geo-spatially tagged transactions per day. Every call, text message, email and data transfer handled by your mobile device creates a transaction with your space-time coordinate (to roughly 60 meters accuracy if there are three cell towers in range), whether you have GPS or not. Got a Blackberry? Every few minutes, it sends a heartbeat, creating a transaction whether you are using the phone or not. If the device is GPS-enabled and you’re using a location-based service your location is accurate to somewhere between 10 and 30 meters. Using Wi-Fi? It is accurate below 10 meters. A thought-provoking roundup of the information leakage with modern locative systems. (via TomC on Twitter)
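
The Redis workflow Simon describes in item 2 (stream CSV rows into a key/value lookup table, then poke at the data from elsewhere) might look roughly like the sketch below. This is not his code. `FakeRedis` is a hypothetical in-memory stand-in so the example runs without a server, and `load_csv`, the column names, and the sample data are all made up for illustration. With the redis-py client you would pass a `redis.Redis()` instance instead, which also offers `set` and `get` (and returns values as bytes by default).

```python
import csv
import io


class FakeRedis:
    """Hypothetical in-memory stand-in offering the set/get calls used below."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        # Like Redis, return None for a key that has never been set.
        return self._data.get(key)


def load_csv(client, csv_file, key_column, value_column):
    """Stream rows from a CSV file into a key/value lookup table."""
    count = 0
    for row in csv.DictReader(csv_file):
        client.set(row[key_column], row[value_column])
        count += 1
    return count


# Made-up sample data standing in for "a few million lines of CSV".
sample = io.StringIO("code,city\nBCN,Barcelona\nAKL,Auckland\n")
store = FakeRedis()
loaded = load_csv(store, sample, "code", "city")
print(loaded, store.get("BCN"))
```

Against a real Redis server the lookups could come from a second interpreter while the loader is still running, which is the part of the workflow the stand-in cannot show.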
Four short links: 21 September 2009

Bad Writing, Tech Immigration, Long Tail Fail?, and The Real McKoi

  1. Dan Brown’s 20 Worst Sentences — awful awful writing, and glorious glorious mockery of it.

    Deception Point, chapter 8: Overhanging her precarious body was a jaundiced face whose skin resembled a sheet of parchment paper punctured by two emotionless eyes.

    It’s not clear what Brown thinks ‘precarious’ means here.

  2. From Australia to the World: The Story of Google Maps and Google Wave (PDF, HTML Cached here) — history of Google Maps and Wave, from the creator. This particularly struck me: I know few matters more frustrating than finding funding for a start-up. Immigration tops the list.
  3. Rethinking The Long Tail: How to Define ‘Hits’ and ‘Niches’ — the argument comes down to absolute vs relative measurements of popularity. Anderson says that relative hides too much, because percentages are meaningless in a world of infinite inventory. Researchers respond that hits and niches are defined in absolute numbers (top 10, bottom 100). The real takeaway is that infinite inventory requires excellent discovery tools drawing upon collective intelligence systems (hence the Netflix recommendation contest). (via timoreilly on Twitter)
  4. The Mckoi Database — MckoiDDB is a database system used by software developers to create applications that store data over a cluster of machines in a network. It is designed to be used in online environments where there are very large sets of both small and big data items that need to be stored, accessed and indexed efficiently in a network cluster. The focus of the MckoiDDB architecture is to support low latency query performance, provide strong data consistency through snapshot transaction isolation, and provide tools to manage logical data models that may change dramatically in physical network environments that may experience similar dramatic change. (via joshua on delicious)
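
The absolute-versus-relative disagreement in item 3 is easy to reproduce with toy numbers. Here is a rough sketch, assuming a made-up Zipf-style sales curve. The 1/rank shape, the catalogue sizes, and the function names are my assumptions, not figures or methods from Anderson or the researchers.

```python
def zipf_sales(n_titles):
    """Hypothetical sales curve: the title at rank r sells in proportion to 1/r."""
    return [1.0 / rank for rank in range(1, n_titles + 1)]


def share_of_top(sales, k):
    """Fraction of all sales captured by the k best-selling titles."""
    ranked = sorted(sales, reverse=True)
    return sum(ranked[:k]) / sum(ranked)


results = {}
for n_titles in (100, 10_000):
    sales = zipf_sales(n_titles)
    results[n_titles] = {
        # Absolute definition of a hit: the top 10 titles.
        "top_10_titles": share_of_top(sales, 10),
        # Relative definition of a hit: the top 10% of the catalogue.
        "top_10_percent": share_of_top(sales, n_titles // 10),
    }
    print(n_titles, results[n_titles])
```

Under that assumed curve, growing the catalogue shrinks the share held by the top 10 titles while the share held by the top 10% grows, so the two measures point in opposite directions on the same data.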
Four short links: 14 September 2009

NoSQL, Gov 2.0 Videos, Linux Conf, Geodata Grump

  1. WTF Is A Supercolumn? — Cassandra is a NoSQL database, a column-family store that scales superwell. Because it’s not the usual relational thing we’re accustomed to, the language can be a barrier to learning: ColumnFamily, SuperColumns, and more. This post explains what’s what, with examples. (via joshua on Delicious)
  2. Gov 2.0 Summit Videos — When I grow up, I want to be Clay Shirky, Tom Steinberg, and Carl Malamud. Some videos are up, others coming up soon–stay tuned for Carl’s, which received the only standing O of the show. [updated with link to Carl’s talk when it was released]
  3. linux.conf.au Schedule Posted — bring the thunda down unda in 2010. The schedule was just released.
  4. Transport for London Does Not Like the Ordnance Survey — an Official Information Request yielded the Transport for London response to an Ordnance Survey “strategy consultation”. The OS should appoint an independent body to review their licence documents and pay them based on the number of words deleted. Sound advice too–OS have crippled the geospatial industry in the UK by charging for their (admittedly finely-detailed) data. (via mattb on delicious)
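
For the Cassandra vocabulary in item 1, one way to picture ColumnFamily and SuperColumn is as nested maps. The sketch below uses plain Python dicts and is only a mental model, not the Cassandra API. The `column` and `get_value` helpers and all the sample data are hypothetical.

```python
import time


def column(name, value):
    """A column modelled as a (name, value, timestamp) record."""
    return {"name": name, "value": value, "timestamp": time.time()}


# ColumnFamily: row key -> {column name -> column}
users = {
    "alice": {
        "email": column("email", "alice@example.com"),
        "country": column("country", "NZ"),
    },
}

# Super ColumnFamily: row key -> {SuperColumn name -> {column name -> column}}
address_book = {
    "alice": {
        "home": {"city": column("city", "Wellington")},
        "work": {"city": column("city", "Auckland")},
    },
}


def get_value(super_cf, row_key, super_column, column_name):
    """Walk the three levels of a Super ColumnFamily and return the stored value."""
    return super_cf[row_key][super_column][column_name]["value"]


print(get_value(address_book, "alice", "home", "city"))
```

The extra level of nesting is the whole difference: a SuperColumn is a named group of columns sitting between the row key and the columns themselves.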
Four short links: 10 September 2009

Hacktivism, Gov 2.0 Futures, Local Geodata, Cassandra Terminology

  1. A Political Startup (Aaron Swartz) — inside account of his grassroots activism efforts, with clever strategies he used to get the outcomes he wanted. A couple months later, frustrated that Norm Coleman wouldn’t drop his spurious legal challenges against Al Franken being named a Senator, we started NormDollar.com. We asked people to donate a dollar each day Norm Coleman didn’t drop out of the race, money we’d spend electing progressive candidates. It was featured on Hardball and throughout the political press. We also videotaped Norm’s donors’ reactions when we told them about the program. But my favorite was when we presented Norm with a big novelty check for him to sign, representing all the money he’d raised for progressives. Now we had money too.
  2. What Gov 2.0 Is Making Me Think (Quinn Norton) — two short and razor sharp observations on Gov 2.0. Like stages of grief, we need to figure out the stages of internet integration for institutions. I suspect grief is in there.
  3. Northland Regional Council Maps in Koordinates — a staggeringly clueful act by local government in New Zealand, releasing a pile of imagery and map layers under CC-BY license. As we hear about national governments’ Gov 2.0 efforts, it’s worth remembering how many more local governments there are–with less money, a different revenue model, and no easy way to reach them all.
  4. Ten Second iPhone Tethering — just did this, and it is awesome. The “download” link takes you to a list of countries, which takes you to a list of telcos, which downloads a config file that gives your phone an “okay to tether” network config for that telco. Some report losing ability to MMS, which I for one won’t notice! (via many, including Engadget)
Four short links: 26 August 2009

Food, NoSQL, Brain Power, Social Data

  1. Better BBQ Through Chemistry — food is the perfect ground for geek training: there are measurements, there’s science, it’s easy to know whether you’ve succeeded, and you can eat all but the worst of your failures. (via BoingBoing)
  2. NoSQL (East) — conference on the East Coast for non-relational databases.
  3. Human Brain Processing Speed — clocked at 60 bits/second, according to this MIT Technology Review article. Their approach eventually led to Hick’s Law, one of the few laws of experimental psychology. It states that the time it takes to make a choice is linearly related to the entropy of the possible alternatives. The results from various reaction-time experiments seem to show that this is the case. Although one byproduct of this approach is that the results are intimately linked to the type of experiment used to measure the reaction time. And that makes each study peculiarly vulnerable to the idiosyncrasies of the experimental approach. Today, Fermín Moscoso del Prado Martín from the Université de Provence in France proposes a new way to study reaction times by analyzing the entropy of their distribution, rather in the manner of thermodynamics. (via Hacker News)
  4. Truly Social Data — Data will only be truly social when you can work with it in the kinds of ways we work with information in the real, non-computational, world. In the real world we don’t ask for permission to have an opinion on something, to add to the ball of information surrounding a concept. Our needs don’t have to be anticipated by programmers. We can share information as we please. For example, nobody owns the concept of Barcelona. If I want to essentially “tag” Barcelona as being hot, or noisy, or beautiful, I just do it. I can keep my opinion private, I can share it with certain others, I can hold conflicting opinions, I can organize things in multiple ways at the same time and give things many names.
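
Hick’s Law as quoted in item 3 (choice time linearly related to the entropy of the alternatives) can be written as T = a + b·H, with H measured in bits. A minimal sketch follows. The intercept `a` and slope `b` are made-up illustrative values, not numbers from the article or from any experiment.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy, in bits, of a distribution over the possible choices."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def hick_reaction_time(probabilities, a=0.2, b=0.15):
    """Predicted choice time in seconds: a fixed cost plus a cost per bit.

    a (seconds) and b (seconds per bit) are hypothetical defaults.
    """
    return a + b * entropy_bits(probabilities)


four_equal_choices = [0.25, 0.25, 0.25, 0.25]
print(entropy_bits(four_equal_choices))        # 2.0 bits
print(hick_reaction_time(four_equal_choices))  # 0.2 + 0.15 * 2.0 seconds
```

Doubling the number of equally likely choices adds one bit of entropy, so under this model it adds a constant `b` seconds rather than doubling the time.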
Four short links: 6 August 2009

Ancient Language, NoSQL, Molecular Gastronomy, SQL Weirdness

  1. Computers Unlock More Secrets of the Indus Valley Script — Four-thousand years ago, an urban civilization lived and traded on what is now the border between Pakistan and India. During the past century, thousands of artifacts bearing hieroglyphics left by this prehistoric people have been discovered. Today, a team of Indian and American researchers are using mathematics and computer science to try to piece together information about the still-unknown script. The team led by a University of Washington researcher has used computers to extract patterns in ancient Indus symbols. The study, published this week in the Proceedings of the National Academy of Sciences, shows distinct patterns in the symbols’ placement in sequences and creates a statistical model for the unknown language. (via ACM TechNews)
  2. NoSQL: If Only It Was That Easy — war stories about the problems of using nosql systems to handle big throughput. We liked Tokyo Tyrant so much, we put it in production. In fact, every request to AboutUs.org hits Tokyo. One of the uses is as a persistent memcached replacement for caching 10 million+ wiki pages (as a json document of all the pieces of our page, which comes out to around 51gb (edited) of data), and it works great. It runs on a single server, it serves up a single type of data, very quickly, and has been a pleasure to use. We keep other ancillary data sets on some other servers too, and it’s great for this. Tokyo Tyrant is a great example of very performant software, but it doesn’t scale. (via straup on Delicious)
  3. WillPowder — Specialty Powders and Spices from Chef Will Goldfarb — molecular gastronomy products from “the golden boy of pastry”. (via joshua on Delicious)
  4. What is the Deal with NULLs? — In the past, I’ve criticized NULL semantics, but in this post I’d just like to explain some corner cases that I think you’ll find interesting, and try to straighten out some myths and misconceptions. […] I believe the above shows, beyond a reasonable doubt, that NULL semantics are unintuitive, and if viewed according to most of the “standard explanations,” highly inconsistent. (via bos on Delicious)
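
To get a feel for the kind of NULL corner case item 4 is talking about, here are a few well-known ones, run against an in-memory SQLite database through Python’s `sqlite3` module. These examples are my own choices, not the ones from the linked post, and the table `t` is hypothetical. Other databases may differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Comparing NULL with = yields NULL (None in Python), not true.
null_equals_null = cur.execute("SELECT NULL = NULL").fetchone()[0]
# IS NULL is the test that actually returns true (1).
null_is_null = cur.execute("SELECT NULL IS NULL").fetchone()[0]

cur.execute("CREATE TABLE t (x INTEGER)")
cur.executemany("INSERT INTO t (x) VALUES (?)", [(1,), (None,)])

# NOT IN against a list containing NULL never evaluates to true,
# so no rows come back, not even the row where x = 1.
not_in_rows = cur.execute(
    "SELECT x FROM t WHERE x NOT IN (2, NULL)"
).fetchall()

# COUNT(*) counts rows; COUNT(x) skips rows where x is NULL.
count_star = cur.execute("SELECT COUNT(*) FROM t").fetchone()[0]
count_x = cur.execute("SELECT COUNT(x) FROM t").fetchone()[0]

conn.close()
print(null_equals_null, null_is_null, not_in_rows, count_star, count_x)
```

The `NOT IN` case is the one that tends to bite in practice: a single NULL in the list silently empties the result.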
Four short links: 31 July 2009

NoSQL, Goldman Sachs, Yahoo! Developer Products and Bing, and Alternate Reality

On this day in history, Mt Fuji exploded (781), Daniel Defoe was put in the stocks for seditious libel but was pelted with flowers (1703), the first U.S. patent was issued (1790), and the radio show The Shadow aired for the first time (1930).

  1. Tokyo Cabinet: Beyond Key-Value Store — description of Tokyo Cabinet and code examples in Ruby. More on the nosql move to leave relational databases behind for certain modern problems (such as scaling).
  2. The Great American Bubble Machine (Rolling Stone) — I know it’s old hat, but read it for the poetry if for nothing else. The first thing you need to know about Goldman Sachs is that it’s everywhere. The world’s most powerful investment bank is a great vampire squid wrapped around the face of humanity, relentlessly jamming its blood funnel into anything that smells like money.
  3. Yahoo!’s Developer Program and Bing — note from Yahoo! to developers, saying that YQL, YUI, and Pipes are safe. For SearchMonkey and BOSS they currently do not have anything concrete to tell you. I assume (and hope) that Delicious is a top-level product, not something under “search”. (via Simon Willison)
  4. Preparing Us for AR — (Schulze & Webb) round up of some apps and toys that show what AR might be, unfettered by current day technological constraints.