OpenRefine — (edited: 7 Dec 2013) Google bought Freebase’s GridWorks, turned it into the excellent Refine tool for working with data sets, then abandoned it; it has now been picked up and developed by the open source community.
CC 4.0 Out — The 4.0 licenses are extremely well-suited for use by governments and publishers of public sector information and other data, especially for those in the European Union. This is due to the expansion in license scope, which now covers sui generis database rights that exist there and in a handful of other countries.
Algorithms and Accountability — Thus, the appearance of an autocompletion suggestion during the search process might make people decide to search for this suggestion although they didn’t have the intention to. A recent paper by Baker and Potts (2013) consequently questions “the extent to which such algorithms inadvertently help to perpetuate negative stereotypes”. (via New Aesthetic Tumblr)
Udacity/Thrun Profile — A student taking college algebra in person was 52% more likely to pass than one taking a Udacity class, making the $150 price tag–roughly one-third the normal in-state tuition–seem like something less than a bargain. In which Udacity pivots to hiring-sponsored workforce training and the new educational revolution looks remarkably like sponsored content.
Amazon is Building Substations (GigaOm) — the company even has firmware engineers whose job it is to rewrite the archaic code that normally runs on the switchgear designed to control the flow of power to electricity infrastructure. Pretty sure that wasn’t a line item in the pitch deck for “the first Internet bookstore”.
Panoramic Images — throw the camera in the air, get a 360×360 image from 36 2-megapixel lenses. Not sure that throwing was previously a recognised UI gesture.
Science Not as Self-Correcting As It Thinks (Economist) — REALLY good discussion of the shortcomings in statistical practice by scientists, peer-review failures, and the complexities of experimental procedure and fuzziness of what reproducibility might actually mean.
Reproducibility Initiative Receives Grant to Validate Landmark Cancer Studies — The key experimental findings from each cancer study will be replicated by experts from the Science Exchange network according to best practices for replication established by the Center for Open Science through the Center’s Open Science Framework, and the impact of the replications will be tracked on Mendeley’s research analytics platform. All of the ultimate publications and data will be freely available online, providing the first publicly available complete dataset of replicated biomedical research and representing a major advancement in the study of reproducibility of research.
Reimagine the Chemistry Set — $50k prize in contest to design a “chemistry set” type kit that will engage kids as young as 8 and inspire people who are 88. We’re looking for ideas that encourage kids to explore, create, build and question. We’re looking for ideas that honor kids’ curiosity about how things work. Backed by the Moore Foundation and Society for Science and the Public.
BF Skinner’s Baby Make Project (BoingBoing) — I got to read some of Skinner’s original writing on the Air-Crib recently and a couple of things stuck out to me. First, it cracked me up. The article, published in 1959 in Cumulative Record, is written in the kind of extra-enthusiastic voice you’re used to hearing Makers use to describe particularly exciting DIY projects.
Redecentralize — project highlighting developers and software that disintermediate the ad-serving parasites preying on our human communication.
The Internet Will Suck All Creative Content Out of the World (David Byrne) — persuasively argued that labels are making all the money from streaming services like Spotify, et al. Musicians are increasingly suspicious of the money and equity changing hands between these services and record labels – both money and equity have been exchanged based on content and assets that artists produced but seem to have no say over. Spotify gave $500m in advances to major labels in the US for the right to license their catalogues.
Your Car is About to go Open Source (ComputerWorld) — an open-source IVI operating system would create a reusable platform consisting of core services, middleware and open application layer interfaces that eliminate the redundant efforts to create separate proprietary systems. That leaves carmakers to differentiate the traditional way: ad-retargeting and spyware.
Steve Yegge on GROK (YouTube) — The Grok Project is an internal Google initiative to simplify the navigation and querying of very large program source repositories. We have designed and implemented a language-neutral, canonical representation for source code and compiler metadata. Our data production pipeline runs compiler clusters over all Google’s code and third-party code, extracting syntactic and semantic information. The data is then indexed and served to a wide variety of clients with specialized needs. The entire ecosystem is evolving into an extensible platform that permits languages, tools, clients and build systems to interoperate in well-defined, standardized protocols.
Deep Learning for Semantic Analysis — When trained on the new treebank, this model outperforms all previous methods on several metrics. It pushes the state of the art in single sentence positive/negative classification from 80% up to 85.4%. The accuracy of predicting fine-grained sentiment labels for all phrases reaches 80.7%, an improvement of 9.7% over bag of features baselines. Lastly, it is the only model that can accurately capture the effect of contrastive conjunctions as well as negation and its scope at various tree levels for both positive and negative phrases.
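To see why negation is hard for the bag-of-features baselines mentioned above, here is a minimal sketch. It assumes a toy word-count representation; the `bag_of_words` helper and both example sentences are made up for illustration and are not taken from the paper or its treebank. Two reviews with opposite sentiment end up with exactly the same features once word order is thrown away.

```python
from collections import Counter


def bag_of_words(sentence: str) -> Counter:
    """Count words, discarding their order (a toy bag-of-features representation)."""
    return Counter(sentence.lower().split())


# Two invented reviews with opposite sentiment (illustrative only, not treebank data).
negative = "the film was not funny it was boring"
positive = "the film was funny it was not boring"

# Both sentences contain the same words the same number of times.
same_features = bag_of_words(negative) == bag_of_words(positive)
print(same_features)  # True: an order-blind model sees no difference between them
```

Any classifier fed only these counts has to give both sentences the same label, which is roughly the gap a tree-structured model that tracks the scope of "not" is meant to close.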
Fireshell — workflow tools and framework for front-end developers.