On Thneeds and the Death of Display Ads (John Battelle) — the video interstitial. Once anathema to nearly every publisher on the planet, this full page unit is now standard on the New York Times, Wired, Forbes, and countless other publishing sites. And while audiences may balk at seeing a full-page video ad after clicking from a search engine or other referring agent, the fact is, skipping the ad is about as hard as turning the page in a magazine. And in magazines, full page ads work for marketers. If you’d raised a kid on AdBlocker, and then at age 15 she saw the ad-filled Internet for the first time, she’d think her browser had been taken over by malware. (via Tim Bray)
Crowdfunded Genomics — a girl with a never-before-seen developmental disorder had her exome (the useful bits of DNA) sequenced, and a never-before-seen DNA mutation found. The money for it was raised by crowdfunding. (via Ed Yong)
Britain To Provide Free Access to Scientific Publications (Guardian) — the Finch report is being implemented! British universities now pay around £200m a year in subscription fees to journal publishers, but under the new scheme, authors will pay “article processing charges” (APCs) to have their papers peer reviewed, edited and made freely available online. The typical APC is around £2,000 per article.
Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained (PDF) — Microsoft Research dug into A/B tests done on Bing and revealed some subtle truths. The statistical theory of controlled experiments is well understood, but the devil is in the details and the difference between theory and practice is greater in practice than in theory […] Generating numbers is easy; generating numbers you should trust is hard! (via Greg Linden)
Data Sequencing Costs (National Human Genome Research Institute) — Cost-per-megabase and cost-per-genome are dropping faster than Moore’s Law now that “second generation techniques” for sequencing, aka “high-throughput sequencing” (a parallelization of the process), have been introduced. (via JP Rangaswami)
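To make "generating numbers is easy" concrete, here is a minimal sketch of the textbook significance check behind a simple A/B comparison: a pooled two-proportion z-test. It is not taken from the paper, and the function name and the counts in the usage lines are hypothetical. It covers only the well-understood theory; the paper's point is that a clean result from a test like this can still mislead.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b are conversion counts, n_a / n_b are users per arm.
    Uses the pooled standard error, the usual textbook choice.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, purely for illustration.
z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Running the same function on two arms that received identical treatment (an A/A test) is a cheap sanity check: it should report a "significant" difference only about as often as the chosen threshold implies.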
Electric Imp — yet another group working on the necessary middleware for ubiquitous networked devices.
How Big Data Transformed the Dairy Industry (The Atlantic) — cutting-edge genomics company Illumina has precisely one applied market: animal science. They make a chip that measures 50,000 markers on the cow genome for attributes that control the economically important functions of those animals.
The Curious Case of Internet Privacy (Cory Doctorow) — I’m with Cory on the perniciousness of privacy-digesting deals between free sites and users, but I’m increasingly becoming convinced that privacy is built into business models and not technology.
23andMe Disproves Its Own Business Model — a hostile article arguing that there’s little predictive power in genetics for diabetes and Parkinson’s, so what’s the point of buying a 23andMe subscription? The wider issue is that, as we’ve known for a while, mapping out your genome only helps with a few clear-cut conditions. For most medical things that we care about, environment is critical too, but that doesn’t mean that personalized genomics won’t help us better target therapies.
They Don’t Complain and They Die Quietly (Derek Powazek) — In this hyper-modern age of real-time always-on location-based info-overload, perhaps a moment of true peace and quiet is the greatest gift one can receive.
The Dragon’s DNA (The Economist) — Beijing Genomics Institute putting more DNA-sequencing capacity into the top floor of a refurbished printing works than is available in the whole USA.
Scribd Coding Blog — very interesting blog about the technology behind and inside Scribd. They process over 150M polygons a day, building web fonts from the fonts in PDF files, and tell you why it’s not straightforward. I wish there were more of these genuinely interesting technology blogs from companies that do interesting things.
Membase — an open-source (Apache 2.0 license) distributed, key-value database management system optimized for storing data behind interactive web applications. These applications must service many concurrent users; creating, storing, retrieving, aggregating, manipulating and presenting data in real-time. Supporting these requirements, membase processes data operations with quasi-deterministic low latency and high sustained throughput. (via Hacker News)
Sergey’s Search (Wired) — Sergey Brin, one of the Google founders, learned he had a gene allele that gave him much higher odds of getting Parkinson’s. His response has been to help medical research, both with money and through 23andMe. Langston decided to see whether the 23andMe Research Initiative might be able to shed some insight on the correlation, so he rang up 23andMe’s Eriksson, and asked him to run a search. In a few minutes, Eriksson was able to identify 350 people who had the mutation responsible for Gaucher’s. A few clicks more and he was able to calculate that they were five times more likely to have Parkinson’s disease, a result practically identical to the NEJM study. All told, it took about 20 minutes. “It would’ve taken years to learn that in traditional epidemiology,” Langston says. “Even though we’re in the Wright brothers early days with this stuff, to get a result so strongly and so quickly is remarkable.”
Startup.gov (YouTube) — Anil Dash talk at Personal Democracy Forum on applying insights from startups to government. I hope the more people say this, the greater the odds it’ll be acted on.
Open Core Software — Marten Mickos (ex-MySQL) talks up “open core” (open source base, proprietary extensions) as a way to resolve the conflict between “change the world with open source” and “make money”. Brian Aker disagrees: There has been no successful launch of an open core company that has reached any significant size, especially of the size that Marten hints at in the article. My take: there are three reasons for open source (freedoms, price, and development scale) and if you close the source to part of your product then the whole product loses those benefits. If you open source enough that the open source bit has massive momentum, then you probably don’t have enough left proprietary to gain huge financial benefit.
Genome Scan Gives Man Insight Into Future Health Risks — the first completely mapped genome of a healthy person aimed at predicting future health risks. The scan was conducted by a team of Stanford researchers and cost about $50,000. The researchers say they can now predict [his] risk for dozens of diseases and how he might respond to a number of widely used medicines. Personalized medicine takes a step closer, and all powered by massive computational power.
Long Handle on Shorted Digital Object — digital object identifiers, and their relationship to shortener services like bit.ly (in which O’Reilly is an investor). The Handle System is relatively inexpensive, but the costs are now higher than the large scale URL shorteners. According to public tax returns, the DOI Foundation pays CNRI about $500,000 per year to run the DOI resolution system. That works out to about 0.7 cents per thousand resolutions. Compare this to Bit.ly, which has attracted $3.5 million of investment and has resolved about 20 billion shortened links, for a cost of about 0.2 cents per thousand. It remains to be seen whether bit.ly will find a sustainable business model; competing directly with DOI is not an impossibility.
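The per-thousand figures in that quote are easy to re-derive, so here is a small sketch that does the arithmetic using only the numbers quoted above. The function names are made up for illustration, and treating total investment as the cost of resolving links is the quoted article's framing, not an accounting claim. Taken at face value, $3.5 million over 20 billion links comes to about 17.5 cents per thousand rather than 0.2, and $500,000 a year at 0.7 cents per thousand implies roughly 71 billion resolutions a year, so the quote may be working from inputs or units it does not state.

```python
def cents_per_thousand(total_dollars, resolutions):
    """Cost in cents for every 1,000 resolutions."""
    return total_dollars * 100 / resolutions * 1000


def implied_resolutions(total_dollars, cents_per_1000):
    """Number of resolutions implied by a total cost and a per-thousand rate."""
    return total_dollars * 100 / cents_per_1000 * 1000


# Figures as quoted in the item above.
bitly_rate = cents_per_thousand(3_500_000, 20_000_000_000)
doi_volume = implied_resolutions(500_000, 0.7)

print(f"bit.ly: {bitly_rate:.1f} cents per thousand links")
print(f"DOI: about {doi_volume / 1e9:.0f} billion resolutions per year implied")
```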
We Are In The Information Business — A well-architected news website leads to content that will keep on providing value, rather than leaving stories to wither away when their immediate news value has faded. Structured content is the stuff that makes a website malleable, rather than cementing you into certain ways of doing things. Structured content is like a big undo button that allows you to reverse decisions and change how your website looks and behaves. Since none of us can predict the future, the freedom to change course as often as we please and not having to worry about escalating legacy costs, well, that’s pretty close to heaven.
Sacramento Credit Union FAQ — The answers to your Security Questions are case sensitive and cannot contain special characters like an apostrophe, or the words “insert,” “delete,” “drop,” “update,” “null,” or “select.” (via Simon Willison)
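That banned-word list reads like a defence against SQL injection applied at the input box. The usual alternative is to pass user input as bound parameters, so it is never parsed as SQL at all. Below is a minimal sketch using Python's built-in sqlite3 module; the table, column, and function names are hypothetical, and nothing here is known about the credit union's actual system.

```python
import sqlite3


def store_answer(conn, user_id, answer):
    """Save a security answer; the ? placeholders keep the input out of the SQL text."""
    conn.execute(
        "INSERT INTO security_answers (user_id, answer) VALUES (?, ?)",
        (user_id, answer),
    )


def check_answer(conn, user_id, answer):
    """Return True if the stored answer matches exactly (case sensitive)."""
    row = conn.execute(
        "SELECT answer FROM security_answers WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return row is not None and row[0] == answer


# Hypothetical schema, in memory only. A real system would store a salted
# hash of the answer rather than the plain text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE security_answers (user_id INTEGER PRIMARY KEY, answer TEXT)"
)

# An answer with an apostrophe and several of the banned words.
tricky = "O'Brien'; DROP TABLE security_answers; --"
store_answer(conn, 1, tricky)
print(check_answer(conn, 1, tricky))
```

With bound parameters the apostrophe and the word "drop" are stored and compared as ordinary text, so there is no need to restrict what people can type.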
Salmon Protocol — protocol to unite comments and annotations with original web pages. A distributed solution to the problem that Disqus tackles in a centralised fashion. Important because we’ll all be historians of our earlier lives and dissipated prolific micro-content is a historian’s nightmare.
Gephi — open source (GPLv3) interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs. I believe tools for data exploration, versus static infographics, are the only way to develop a new sense for data. (via mattb on Delicious)
Skinput — a bio-acoustic sensor lets you use your skin to write, tap, drag, etc. See also BBC article. (via Mike Loukides)
Spokeo — abysmal indictment of society, first prize in mankind’s race to the bottom. Uncover personal photos, videos, and secrets … GUARANTEED! Spokeo deep searches within 48 major social networks to find truly mouth-watering news about friends and coworkers. PS, anybody who gives their gmail username and password to a site that specializes in dishing dirt can only be described as a fucking idiot. (via Jim Stogdill, who was equally disappointed in our species)
Biologists rally to sequence ‘neglected’ microbes (Nature) — The Genomic Encyclopedia of Bacteria and Archaea is a project to sequence genomes from more branches of the evolutionary tree of life. Eisen’s team selected and sequenced more than 100 ‘neglected’ species that lacked close relatives among the 1,000 genomes already in GenBank. The researchers reported earlier this year at the JGI’s Fourth Annual User Meeting that even mapping the first 56 of these microbes’ genomes increased the rate of discovery of new gene and protein families with new biological properties. It also improved the researchers’ ability to predict the role of genes with unknown functions in already sequenced organisms. (via Jonathan Eisen)
Mail Learning: The What and the How (Simon Cozens) — a few things that a really good mail analysis tool needs to do. I hope that my mail client and server do these out of the box in the next five years.
Introducing the Open Web Foundation Agreement — The Open Web Foundation Agreement itself establishes the copyright and patent rights for a specification, ensuring that downstream consumers may freely implement and reuse the licensed specification without seeking further permission. In addition to the agreement itself, we also created an easy-to-read “Deed” that provides a high level overview of the agreement. Applying the open source approach to better standards.
Complete Genomics publishes in Science on low-cost sequencing of 3 human genomes (press release) — The consumables cost for these three genomes sequenced on the proof-of-principle genomic DNA nanoarrays ranged from $8,005 for 87x coverage to $1,726 for 45x coverage for the samples described in this report. Drive that cost down! There’s a gold rush in biological discovery at the moment as we pick the low-hanging fruit of gross correlations between genome and physiome, but the science to reveal the workings of cause and effect is still in its infancy. We’re in the position of the 18th century natural philosophers who were playing with static electricity, oxygen, anaesthetics, and so on but who lacked today’s deeper insights into physical and chemical structure that explain the effects they were able to obtain. More data at this stage means more low-hanging fruit can be plucked, but the real power comes when we understand “how” and not just “what”. (via BoingBoing)
Far From a Lab? Turn a Cellphone into a Microscope (NY Times) — for some tests, you can use a camphone instead of a microscope. In one prototype, a slide holding a finger prick of blood can be inserted over the phone’s camera sensor. The sensor detects the slide’s contents and sends the information wirelessly to a hospital or regional health center. For instance, the phones can detect the asymmetric shape of diseased blood cells or other abnormal cells, or note an increase of white blood cells, a sign of infection, he said.
Augmented reality helps Marine mechanics carry out repair work (MIT TR) — A user wears a head-worn display, and the AR system provides assistance by showing 3-D arrows that point to a relevant component, text instructions, floating labels and warnings, and animated, 3-D models of the appropriate tools. An Android-powered G1 smart phone attached to the mechanic’s wrist provides touchscreen controls for cueing up the next sequence of instructions. […] The mechanics using the AR system located and started repair tasks 56 percent faster, on average, than when wearing the untracked headset, and 47 percent faster than when using just a stationary computer screen.