Steve Yegge on GROK (YouTube) — The Grok Project is an internal Google initiative to simplify the navigation and querying of very large program source repositories. We have designed and implemented a language-neutral, canonical representation for source code and compiler metadata. Our data production pipeline runs compiler clusters over all Google’s code and third-party code, extracting syntactic and semantic information. The data is then indexed and served to a wide variety of clients with specialized needs. The entire ecosystem is evolving into an extensible platform that permits languages, tools, clients and build systems to interoperate in well-defined, standardized protocols.
Deep Learning for Semantic Analysis — When trained on the new treebank, this model outperforms all previous methods on several metrics. It pushes the state of the art in single sentence positive/negative classification from 80% up to 85.4%. The accuracy of predicting fine-grained sentiment labels for all phrases reaches 80.7%, an improvement of 9.7% over bag of features baselines. Lastly, it is the only model that can accurately capture the effect of contrastive conjunctions as well as negation and its scope at various tree levels for both positive and negative phrases.
Fireshell — workflow tools and framework for front-end developers.
Researchers Can Slip an Undetectable Trojan into Intel’s Ivy Bridge CPUs (Ars Technica) — The exploit works by severely reducing the amount of entropy the RNG normally uses, from 128 bits to 32 bits. The hack is similar to stacking a deck of cards during a game of Bridge. Keys generated with an altered chip would be so predictable an adversary could guess them with little time or effort required. The severely weakened RNG isn’t detected by any of the “Built-In Self-Tests” required for the P800-90 and FIPS 140-2 compliance certifications mandated by the National Institute of Standards and Technology.
rethinkdb — open-source distributed JSON document database with a pleasant and powerful query language.
Teach Kids Programming — a collection of resources. I start on Scratch much sooner, and 12+ definitely need the Arduino, but generally I agree with the things I recognise, and have a few to research …
Fog Creek’s Remote Work Policy — In the absence of new information, the assumption is that you’re producing. When you step outside the HQ work environment, you should flip that burden of proof. The burden is on you to show that you’re being productive. Is that because we don’t trust you? No. It’s because a few normal ways of staying involved (face time, informal chats, lunch) have been removed.
MillWheel (PDF) — a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous ﬂow of records, all within the envelope of the framework’s fault-tolerance guarantees. From Google Research.
Probabilistic Scraping of Plain Text Tables — the method leverages topological understanding of tables, encodes it declaratively into a mixed integer/linear program, and integrates weak probabilistic signals to classify the whole table in one go (at sub second speeds). This method can be used for any kind of classification where you have strong logical constraints but noisy data.
MIT Educational MMO — The initial phase will cover topics in biology, algebra, geometry, probability, and statistics, providing students with a collaborative, social experience in a systems-based game world where they can explore how the world works and discover important scientific concepts. (via KQED)
Changing Norms (Atul Gawande) — neither penalties nor incentives achieve what we’re really after: a system and a culture where X is what people do, day in and day out, even when no one is watching. “You must” rewards mere compliance. Getting to “X is what we do” means establishing X as the norm.
The Mythologies of Big Data (YouTube) — Kate Crawford at UC Berkeley iSchool. The six months: ‘Big data are new’, ‘Big data is objective’, ‘Big data don’t discriminate’, ‘Big data makes cities smart’, ‘Big data is anonymous’, ‘You can opt out of big data’. (via Sam Kinsley)
The STEM Crisis is a Myth (IEEE Spectrum) — Every year U.S. schools grant more STEM degrees than there are available jobs. When you factor in H-1B visa holders, existing STEM degree holders, and the like, it’s hard to make a case that there’s a STEM labor shortage.
Bezos at the Post (Washington Post) — “All businesses need to be young forever. If your customer base ages with you, you’re Woolworth’s,” added Bezos.[...] “The number one rule has to be: Don’t be boring.” (via Julie Starr)
Summingbird (Github) — Twitter open-sourced library that lets you write streaming MapReduce programs that look like native Scala or Java collection transformations and execute them on a number of well-known distributed MapReduce platforms like Storm and Scalding.
Go Ahead, Mess with Texas Instruments (The Atlantic) — School typically assumes that answers fall neatly into categories of “right” and “wrong.” As a conventional tool for computing “right” answers, calculators often legitimize this idea; the calculator solves problems, gives answers. But once an endorsed, conventional calculator becomes a subversive, programmable computer it destabilizes this polarity. Programming undermines the distinction between “right” and “wrong” by emphasizing the fluidity between the two. In programming, there is no “right” answer. Sure, a program might not compile or run, but making it offers multiple pathways to success, many of which are only discovered through a series of generative failures. Programming does not reify “rightness;” instead, it orients the programmer toward intentional reading, debugging, and refining of language to ensure clarity.
When A Spouse Puts On Google Glass (NY Times) — Google Glass made me realize how comparably social mobile phones are. [...] People gather around phones to watch YouTube videos or look at a funny tweet together or jointly analyze a text from a friend. With Glass, there was no such sharing.