word2vec — This tool provides an efficient implementation of the continuous bag-of-words and skip-gram architectures for computing vector representations of words. These representations can be subsequently used in many natural language processing applications and for further research. From Google Research paper Efficient Estimation of Word Representations in Vector Space.
What Every Frontend Developer Should Know about Page Rendering — Rendering has to be optimized from the very beginning, when the page layout is being defined, as styles and scripts play the crucial role in page rendering. Professionals have to know certain tricks to avoid performance problems. This arcticle does not study the inner browser mechanics in detail, but rather offers some common principles.
Cayley — an open-source graph inspired by the graph database behind Freebase and Google’s Knowledge Graph.
Alice in Warningland (PDF) — We performed a field study with Google Chrome and Mozilla Firefox’s telemetry platforms, allowing us to collect data on 25,405,944 warning impressions. We find that browser security warnings can be successful: users clicked through fewer than a quarter of both browser’s malware and phishing warnings and third of Mozilla Firefox’s SSL warnings. We also find clickthrough rates as high as 70.2% for Google Chrome SSL warnings, indicating that the user experience of a warning can have tremendous impact on user behaviour.
Quick DT — open source (Java) decision tree learner.
Revealing Hidden Changes to Supreme Court Opinions — WHEREAS, It is now well-documented that the Supreme Court of the United States makes changes to its opinions after the opinion is published; and WHEREAS, Only “Four legal publishers are granted access to “change pages” that show all revisions. Those documents are not made public, and the court refused to provide copies to The New York Times”; and WHEREAS, git makes it easy to identify when changes have been made; RESOLVED, I shall apply a cron job to at least identify when the actual PDF has changed so everyone can see which documents have changed.
Microsoft’s “Killer” Android Patents Revealed (Ars Technica) — Chinese Government required them disclosed as part of MSFT-Nokia merger. The patent lists are strategically significant, because Microsoft has managed to build a huge patent-licensing business by taxing Android phones without revealing what kind of legal leverage they really have over those phones.
HTTPie — a command line HTTP client, a user-friendly HTTP client.
Apple’s Secure Database for Users (Ian Waring) — excellent breakdown of how Apple have gone out of their way to make their cloud database product safe and robust. They may be slow to “the cloud” but they have decades of experience having users as customers instead of products.
101 Uses for Content Mining — between the list in the post and the comments from readers, it’s a good introduction to some of the value to be obtained from full-text structured and unstructured access to scientific research publications.
3D Printers Have a Lot to Learn from Sewing Machines — Sewing does not create more waste but, potentially, less, and the process of sewing is filled with opportunities for increasing one’s skills and doing it over as well as doing it yourself. What are quilts, after all, but a clever way to use every last scrap of precious fabric? (via Jenn Webb)
3D Petrie Museum — The Petrie Museum of Egyptian Archaeology has one of the largest ancient Egyptian and Sudanese collections in the world and they’ve put 3D models of their goods online. Not (yet) available for download, only viewing which seem a bug.
Sandy Pentland on Wearables (The Verge) — Pentland was also Nathan Eagle’s graduate advisor, and behind the Reality Mining work at MIT. Check out his sociometer: One study revealed that the sociometer helps discern when someone is bluffing at poker roughly 70 percent of the time; another found that a wearer can determine who will win a negotiation within the first five minutes with 87 percent accuracy; yet another concluded that one can accurately predict the success of a speed date before the participants do.
Hardening Android for Security and Privacy — a brilliant project! prototype of a secure, full-featured, Android telecommunications device with full Tor support, individual application firewalling, true cell network baseband isolation, and optional ZRTP encrypted voice and video support. ZRTP does run over UDP which is not yet possible to send over Tor, but we are able to send SIP account login and call setup over Tor independently.
The Great Smartphone War (Vanity Fair) — “I represented [the Swedish telecommunications company] Ericsson, and they couldn’t lie if their lives depended on it, and I represented Samsung and they couldn’t tell the truth if their lives depended on it.” That’s the most catching quote, but interesting to see Samsung’s patent strategy described as copying others, delaying the lawsuits, settling before judgement, and in the meanwhile ramping up their own innovation. Perhaps the other glory part is the description of Samsung employee shredding and eating incriminating documents while stalling lawyers out front. An excellent read.
socketcluster — highly scalable realtime WebSockets based on Engine.io. They have screenshots of 100k messages/second on an 8-core EC2 m3.2xlarge instance.
Machine Learning on a Board — everything good becomes hardware, whether in GPUs or specialist CPUs. This one has a “Machine Learning Co-Processor”. Interesting idea, to package up inputs and outputs with specialist CPU, but I wonder whether it’s a solution in search of a problem. (via Pete Warden)
Jasper Project — an open source platform for developing always-on, voice-controlled applications. Shouting is the new swiping—I eagerly await Gartner touting the Internet-of-things-that-misunderstand-you.