- More Tools for Managing and Reproducing Complex Data Projects (Ben Lorica) — As I survey the landscape, the types of tools remain the same, but interfaces continue to improve, and domain specific languages (DSLs) are starting to appear in the context of data projects. One interesting trend is that popular user interface models are being adapted to different sets of data professionals (e.g. workflow tools for business users).
- Graphical Linear Algebra — or “Graphical The-Subject-That-Kicked-Nat’s-Butt” as I read it.
- Consistent Hashing: A Guide and Go Implementation — easy-to-follow article (and source).
- NoTCP Manifesto — a nice summary of the reasons to build custom protocols over UDP, masquerading as church-nailed heresy. Today’s heresy is just the larval stage of tomorrow’s constricting orthodoxy.
Choosing the right tool for the beginning programmer
You’ve picked the language you want to learn, and you’ve learned more about the various language paradigms. You want to get started writing some actual code—but what tool do you use? With almost all languages, you can start writing code in any old text editor available to you, and that’s what programmers used to do, decades ago. Any good engineer, though, will find tools to make his or her job easier, and that’s where the Integrated Development Environment (IDE) comes into play. So now you need to learn how to use a tool before you can learn the language? Not necessarily. Although many programmers consider “should I use an IDE?” to be a question with an obvious answer, they don’t necessarily agree on what that answer is.
USB in Cars, Capture Presentations, Amazon Redshift, and Polytweeting
- Hyundai Replacing Cigarette Lighters with USB Ports (Quartz) — sign of the times. (via Julie Starr)
- Freeseer — free, open source, cross-platform application that captures or streams your desktop—designed for capturing presentations. Would you like freedom with your screencast?
- Amazon Redshift: What You Need to Know — good write-up of experience using Amazon’s column database.
- GroupTweet — Allow any number of contributors to Tweet from a group account safely and securely. (via Jenny Magiera)
Google Code Analysis, Deep Learning, Front-End Workflow, and SICP in JS
- Steve Yegge on GROK (YouTube) — The Grok Project is an internal Google initiative to simplify the navigation and querying of very large program source repositories. We have designed and implemented a language-neutral, canonical representation for source code and compiler metadata. Our data production pipeline runs compiler clusters over all Google’s code and third-party code, extracting syntactic and semantic information. The data is then indexed and served to a wide variety of clients with specialized needs. The entire ecosystem is evolving into an extensible platform that permits languages, tools, clients and build systems to interoperate in well-defined, standardized protocols.
- Deep Learning for Semantic Analysis — When trained on the new treebank, this model outperforms all previous methods on several metrics. It pushes the state of the art in single sentence positive/negative classification from 80% up to 85.4%. The accuracy of predicting fine-grained sentiment labels for all phrases reaches 80.7%, an improvement of 9.7% over bag of features baselines. Lastly, it is the only model that can accurately capture the effect of contrastive conjunctions as well as negation and its scope at various tree levels for both positive and negative phrases.
- Fireshell — workflow tools and framework for front-end developers.
Defining a powerful toolkit
The rise of the phrase “web platform” over the past few years makes me very happy.
Velocity 2013 Speaker Series
One important thing that shapes the overall single-page application performance is instrumentation of the application code. The most obvious use-case is for analyzing code coverage, particularly when running unit tests and functional tests. Code that never gets executed during the testing process is an accident waiting to happen. While it is unreasonable to have 100% coverage, having no coverage data at all does not provide a lot of confidence. These days, we are seeing easy-to-use coverage tools such as Istanbul and Blanket.js become widespread, and they work seamlessly with popular test frameworks such as Jasmine, Mocha, Karma, and many others.
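To make the coverage idea concrete, here is a minimal sketch of function-level instrumentation in Python. It is much coarser than what tools such as Istanbul or Blanket.js do, and it is not how they work internally. The names `tracked`, `uncovered`, `format_name`, and `format_phone` are hypothetical, made up purely for illustration.

```python
import functools

# Hypothetical registries: every instrumented function, and those actually called.
REGISTERED = set()
CALLED = set()


def tracked(func):
    """Instrument a function so that each call is recorded by name."""
    REGISTERED.add(func.__name__)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        CALLED.add(func.__name__)
        return func(*args, **kwargs)

    return wrapper


@tracked
def format_name(first, last):
    return f"{last}, {first}"


@tracked
def format_phone(digits):
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def run_tests():
    # The (illustrative) test suite exercises only one of the two functions.
    assert format_name("Ada", "Lovelace") == "Lovelace, Ada"


def uncovered():
    """Return the names of instrumented functions that no test executed."""
    return sorted(REGISTERED - CALLED)


if __name__ == "__main__":
    run_tests()
    print("never executed:", uncovered())
```

After `run_tests()` the report lists `format_phone`, which is the kind of never-executed code the paragraph above warns about.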
Instrumented code can be leveraged to perform another type of analysis: run-time scalability. Performance is often measured by the elapsed time, e.g. how long it takes to perform a certain operation. This stopwatch approach only tells half of the story. For example, finding that an address book application sorts 10 contacts in 10 ms tells us nothing about the complexity of the sorting. How will it cope with 100 contacts? 1,000 contacts? Since it is not always practical to carry out a formal analysis on the application code to figure out its complexity, the workaround is to figure out the empirical run-time complexity. In this example, it can be done by instrumenting and monitoring a particular part of the sorting implementation (probably the “swap two entries” function) and watching how it behaves with different input sizes.
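The swap-counting approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the address book's real code: the sort is a plain bubble sort chosen for simplicity, the input is deliberately reverse-sorted (a worst case), and `SwapCounter`, `bubble_sort`, and `measure_swaps` are hypothetical names.

```python
from typing import Callable, List


class SwapCounter:
    """Hypothetical instrumentation: counts every call to the swap function."""

    def __init__(self) -> None:
        self.count = 0

    def swap(self, items: List[str], i: int, j: int) -> None:
        self.count += 1
        items[i], items[j] = items[j], items[i]


def bubble_sort(items: List[str], swap: Callable[[List[str], int, int], None]) -> None:
    """Sort in place, routing every exchange through the supplied swap function."""
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                swap(items, i, i + 1)


def measure_swaps(size: int) -> int:
    """Sort `size` reverse-ordered contacts and return how many swaps occurred."""
    # Zero-padded names so that string order matches numeric order.
    contacts = [f"contact{k:05d}" for k in range(size, 0, -1)]
    counter = SwapCounter()
    bubble_sort(contacts, counter.swap)
    assert contacts == sorted(contacts)  # sanity check: the list is sorted
    return counter.count


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(f"{size:>5} contacts -> {measure_swaps(size):>7} swaps")
```

For this worst-case input the swap count is n(n-1)/2, so 10 contacts need 45 swaps and 100 contacts need 4,950. A tenfold increase in input causing roughly a hundredfold increase in swaps points to quadratic growth, something a single stopwatch reading at 10 contacts would never reveal.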
Complex Exploit, Better Coding Tools, Online Coding Tools, and DIY 3D-Printed Dolls
- Tale of Two Pwnies (Chromium Blog) — So, how does one get full remote code execution in Chrome? In the case of Pinkie Pie’s exploit, it took a chain of six different bugs in order to successfully break out of the Chrome sandbox. Lest you think all attacks come from mouth-breathing script kiddies, this is how the pros do it. (via Bryan O’Sullivan)
- The Future is Specific (Chris Granger) — In traditional web-MVC, the code necessary to serve a single route is spread across many files in many different folders. In a normal editor this means you need to do a lot of context switching to get a sense for everything going on. Instead, this mode replaces the file picker with a route picker, as routes seem like the best logical unit for a website. There’s a revolution coming in web dev tools: we’ve had the programmer adapting to the frameworks with little but textual assistance from the IDE. I am loving this flood of creativity because it has the promise to reduce bugs and increase the speed by which we generate good code.
- Makie — design a doll online, they’ll 3D-print and ship it to you. Hello, future of manufacturing, fancy seeing you in a dollhouse!
There's a big gap between easy-to-use tools and competent programming.
Apple is the latest in a long line of entities that want to bring software development to the masses. Here's why that idea, in general, is doomed to fail.
DNS Benchmarking, Intro to Macroeconomics, Materials-Sensing Cameras, and 3D Printing Lab Messed Around
- Namebench (Google Code) — hunts down the fastest DNS servers for your computer to use. (via Nelson Minar)
- Primer on Macroeconomics (Jig) — reading suggestions for introductions to macroeconomics suitable to understand the financial crisis and proposed solutions. (via Tim O’Reilly)
- Smarter Cameras Plumb Composition — A new type of smarter camera can take a picture but also assess the chemical composition of the objects being imaged. This enables automated inspection systems to discern details that would be missed by conventional cameras. Interesting how cameras are getting smarter: Kinect is another significant case in point. (via Slashdot)
- Not So Open — 3D printing lab at the University of Washington had to stop helping outsiders because of a crazy new IP policy from the university administration. These folks were doing amazing work, developing and sharing recipes for new materials to print with (iced tea, rice flour, and more). (via BoingBoing)