ENTRIES TAGGED "realtime"
Real Time Exploratory Analytics, Algorithmic Agendas, Disassembly Engine, and Future of Employment
- Druid — open source clustered data store (not key-value store) for real-time exploratory analytics on large datasets.
- It’s Time to Engineer Some Filter Failure (Jon Udell) — Our filters have become so successful that we fail to notice: We don’t control them, They have agendas, and They distort our connections to people and ideas. The idea that algorithms have agendas is worth emphasising. Reality doesn’t have an agenda, but the deployer of a similarity metric has decided what features to look for, what metric they’re optimising, and what to do with the similarity data. These are all choices with an agenda.
- Capstone — open source multi-architecture disassembly engine.
- The Future of Employment (PDF) — We note that this prediction implies a truncation in the current trend towards labour market polarization, with growing employment in high and low-wage occupations, accompanied by a hollowing-out of middle-income jobs. Rather than reducing the demand for middle-income occupations, which has been the pattern over the past decades, our model predicts that computerisation will mainly substitute for low-skill and low-wage jobs in the near future. By contrast, high-skill and high-wage occupations are the least susceptible to computer capital. (via The Atlantic)
Malware Industrial Complex, Indies Needed, TV Analytics, and HTTP Benchmarking
- Welcome to the Malware-Industrial Complex (MIT) — brilliant phrase, sound analysis.
- Stupid Stupid xBox — The hardcore/soft-tv transition and any lead they feel they have is simply not defensible by licensing other industries’ generic video or music content because those industries will gladly sell and license the same content to all other players. A single custom studio of 150 employees also can not generate enough content to defensibly satisfy 76M+ customers. Only with quality primary software content from thousands of independent developers can you defend the brand and the product. Only by making the user experience simple, quick, and seamless can you defend the brand and the product. Never seen a better put statement of why an ecosystem of indies is essential.
- Data Feedback Loops for TV (Salon) — Netflix’s data indicated that the same subscribers who loved the original BBC production also gobbled down movies starring Kevin Spacey or directed by David Fincher. Therefore, concluded Netflix executives, a remake of the BBC drama with Spacey and Fincher attached was a no-brainer, to the point that the company committed $100 million for two 13-episode seasons.
- wrk — a modern HTTP benchmarking tool capable of generating significant load when run on a single multi-core CPU. It combines a multithreaded design with scalable event notification systems such as epoll and kqueue.
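For readers unfamiliar with the event-notification style mentioned in the wrk entry, here is a minimal Python sketch of the idea. It is not wrk's code and makes no claim about how wrk is structured internally; the function name `echo_once` and the local socket pair are purely illustrative assumptions. Python's standard `selectors.DefaultSelector` picks the most efficient mechanism the platform offers (epoll on Linux, kqueue on BSD and macOS), so the loop below waits for readiness events instead of blocking on each socket in turn.

```python
import selectors
import socket


def echo_once(payload: bytes) -> bytes:
    """Send a small payload through a local socket pair and read it back
    using readiness events rather than a blocking read.

    Hypothetical helper for illustration only; keep payloads small, since
    the blocking send happens before any reading starts.
    """
    # DefaultSelector resolves to epoll, kqueue, or a portable fallback.
    sel = selectors.DefaultSelector()
    reader, writer = socket.socketpair()
    reader.setblocking(False)
    sel.register(reader, selectors.EVENT_READ)

    received = b""
    try:
        writer.sendall(payload)
        while len(received) < len(payload):
            events = sel.select(timeout=1.0)
            if not events:
                break  # nothing became readable in time; give up
            for key, _mask in events:
                received += key.fileobj.recv(4096)
    finally:
        sel.unregister(reader)
        sel.close()
        reader.close()
        writer.close()
    return received


if __name__ == "__main__":
    print(echo_once(b"ping"))
```

A load generator built this way can keep many connections open per thread, because one `select` call reports every socket that is ready; running one such loop per core is roughly the combination the wrk description points at.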
School District Saves With Open Source, Apple ][ Presentation Tool, Tech Talks, and Realtime Dashboard
- School District Builds Own Software — By taking a not-for-profit approach and using freely available open-source tools, Saanich officials expect to develop openStudent for under $5 million, with yearly maintenance pegged at less than $1 million. In contrast, the B.C. government says it spent $97 million over the past 10 years on the B.C. enterprise Student Information System — also known as BCeSIS — a provincewide system already slated for replacement.
- Giving a Presentation From an Apple ][ — A co-worker used an iPad to give a presentation. I thought: why take a machine as powerful as an early Cray to do something as low-overhead as display slides? Why not use something with much less computing power? From this asoft_presenter was born. The code is a series of C programs that read text files and generate a large Applesoft BASIC program that actually presents the slides. (via Jim Stogdill)
- AirBnB TechTalks — impressive collection of interesting talks, part of the AirBnB techtalks series.
- Gawker’s Realtime Dashboard — this is not just technically and visually cool, but also food for thought about what they’re choosing to measure and report on in real time (new vs returning split, social engagement, etc.). Does that mean they hope to be able to influence those variables in real time? (via Alex Howard)
Free Books, Analytics Goofs, Book Boilerplate, and Learn CS with the Raspberry Pi
- Free Book Sifter — lists all the free books on Amazon, has RSS feeds and newsletters. (via BoingBoing)
- Whom the Gods Would Destroy, They First Give Realtime Analytics — a few key reasons why truly real-time analytics can open the door to a new type of (realtime!) bad decision making. [U]ser demographics could be different day over day. Or very likely, you could see a major difference in user behavior immediately upon releasing a change, only to watch it evaporate as users learn to use new functionality. Given all of these concerns, the conservative and reasonable stance is to only consider tests that last a few days or more.
- Web Book Boilerplate (Github) — uses plain old markdown and generates a well structured HTML version of your written words. Since it’s sitting on top of Pandoc and Grunt, you can easily make your books available for every platform. MIT-style license.
- Raspberry Pi Education Manual (PDF) — from Scratch to Python and HCI all via the Raspberry Pi. Intended to be informative and a series of lessons for teachers and students learning coding with the Raspberry Pi as their first device.
Inside Personalized Advertising, Printing Presses Were Good For The Economy, Digital Access, and Ebooks in Libraries
- Web-Scale User Modeling for Targeting (Yahoo! Research, PDF) — research paper that shows how online advertisers build profiles of us and what matters (e.g., ads we buy from are more important than those we simply click on). Our recent surfing patterns are more relevant than historical ones, which is another indication that the value of data analytics increases the closer to real-time it happens. (via Greg Linden)
- Information Technology and Economic Change — research showing that cities which adopted the printing press had no prior growth advantage, but subsequently grew far faster than similar cities without printing presses. [...] The second factor behind the localisation of spillovers is intriguing given contemporary questions about the impact of information technology. The printing press made it cheaper to transmit ideas over distance, but it also fostered important face-to-face interactions. The printer’s workshop brought scholars, merchants, craftsmen, and mechanics together for the first time in a commercial environment, eroding a pre-existing “town and gown” divide.
- They Just Don’t Get It (Cameron Neylon) — curating access to a digital collection does not scale.
- Should Libraries Get Out of the Ebook Business? — provocative thought: the ebook industry is nascent, a small number of patrons have ereaders, the technical pain of DRM and incompatible formats makes for disproportionate support costs, and there are already plenty of worthy things libraries should be doing. I only wonder how quickly the dynamics change: a minority may have dedicated ereaders but a large number have smartphones and are reading on them already.
Feedback, Open Source Marketing, Programming in the Browser, and Twitter's Open Source Realtime Engine
- Implicit and Explicit Feedback — for preferences and recommendations, implicit signals (what people clicked on and actually listened to) turn out to be strongly correlated with what they would say if you asked. (via Greg Linden)
- Pivoting to Monetize Mobile Hyperlocal Social Gamification by Going Viral — Schuyler Erle’s stellar talk at the open source geospatial tools conference. Video, may cause your sides to ache.
- Twitter Storm (GitHub) — distributed realtime computation system, intended to be for realtime processing what Hadoop is for batch processing. Interesting because you improve most reporting and control systems when you move them closer to real-time. Eclipse-licensed open source.
Hilary Mason on how Bitly applies the Internet's real-time data.
In this interview, Bitly chief scientist and Strata speaker Hilary Mason discusses the application of real-time data and the difference between analytics and data science.
Data and education, real-time data, what publishers can learn from startups.
This week on O'Reilly: We looked at how data can help education, Theo Schlossnagle made the case for real-time business data, and we learned that tech startups can teach publishers a thing or two.
OSCON's co-chairs dig into the OSCON Data program.
OSCON's co-chairs discuss sessions in the OSCON Data conference and the people who might be interested in the associated topics.