ENTRIES TAGGED "scala"
Binary Data Is Back, Scala Data, Visualization Grammar, and Pastebin Monitor
- Cap’n Proto — open source faster protocol buffers (binary data interchange format and RPC system).
- Saddle — a high performance data manipulation library for Scala.
- Vega — a visualization grammar, a declarative format for creating, saving and sharing visualization designs. (via Flowing Data)
- dumpmon — Twitter bot that monitors paste sites for password dumps and other sensitive information. Source is on GitHub; see the announcement for more.
Two core Scala libraries support features for mocking and data generation.
Alex Payne on Scala's upside and combining object-oriented and functional capabilities.
Alex Payne, co-author of "Programming Scala," talks about the advantages of using Scala.
The benefits of functional languages and functional language techniques.
O'Reilly editors Mike Loukides and Mike Hendrickson discuss the advantages of functional programming languages and how functional language techniques can be deployed with almost any language.
Text Analysis Bundle, Scala Probabilistic Modeling, Game Analytics, and Encouraging Writing
- Pattern — a BSD-licensed bundle of Python tools for data retrieval, text analysis, and data visualization. If you were going to get started with accessible data (Twitter, Google), the fundamentals of analysis (entity extraction, clustering), and some basic visualizations of graph relationships, you could do a lot worse than to start here.
- Factorie (Google Code) — Apache-licensed Scala library for a probabilistic modeling technique successfully applied to [...] named entity recognition, entity resolution, relation extraction, parsing, schema matching, ontology alignment, latent-variable generative models, including latent Dirichlet allocation. The state-of-the-art big data analysis tools are increasingly open source, presumably because the value lies in their application not in their existence. This is good news for everyone with a new application.
- Playtomic — analytics as a service for gaming companies to learn what players actually do in their games. There aren’t many fields untouched by analytics.
- Write or Die — iPad app for writers where, if you don’t keep writing, it begins to delete what you wrote earlier. Good for production to deadlines; reflective editing and deep thought not included.
Rare Visualization, Google+ Tech, Scala+Erlang, and In-Database Analytics
- Slopegraphs — a nifty Tufte visualization which conveys rank, value, and delta over time. Includes pointers to how to make them, and guidelines for when and how they work. (via Avi Bryant)
- scalang (GitHub) — a Scala wrapper that makes it easy to interface with Erlang, so you can use two hipster-compliant built-to-scale technologies in the same project. (via Justin Sheehy)
- Madlib — an open-source library for scalable in-database analytics. It provides data-parallel implementations of mathematical, statistical and machine learning methods for structured and unstructured data. (via Mike Loukides)
Java is as much about the JVM as it is the language.
This overview of JVM-based programming compares the relative strengths of the major languages.
DOM Snitch, Hadoop in Scala, Pregel in Hadoop in Scala, Reflections on the Company
- DOM Snitch — an experimental Chrome extension that enables developers and testers to identify insecure practices commonly found in client-side code. See also the introductory post. (via Hacker News)
- Spark — Hadoop-alike in Scala. Spark was initially developed for two applications where keeping data in memory helps: iterative algorithms, which are common in machine learning, and interactive data mining. In both cases, Spark can outperform Hadoop by 30x. However, you can use Spark’s convenient API for general data processing too. (via Hilary Mason)
- Bagel — an implementation of the Pregel graph processing framework on Spark. (via Oliver Grisel)
- Week 315 (Matt Webb) — read this entire post. It will make you smarter. The company’s decisions aren’t actually the shareholders’ decisions. A company has a culture which is not the simple sum of the opinions of the people in it. A CEO can never be said to perform an action in the way that a human body can be said to perform an action, like picking an apple. A company is a weird, complex thing, and rather than attempt (uselessly) to reduce it to people within it, it makes more sense – to me – to approach it as an alien being and attempt to understand its biology and momentums only with reference to itself. Having done that, we can then use metaphors to attempt to explain its behaviour: we can say that it follows profit, or it takes an innovative step, or that it is middle-aged, or that it treats the environment badly, or that it takes risks. None of these statements is literally true, but they can be useful to have in mind when attempting to negotiate with these bizarre, massive creatures. If anyone wonders why I link heavily to BERG’s work, it’s because they have some incredibly thoughtful and creative people who are focused and productive, and it’s Webb’s laser-like genius that makes it possible. They’re doing a lot of subtle new things and it’s a delight and privilege to watch them grow and reflect.
How Facebook Ships, EU Funds, Bacteria Play, and Screens Capture
- How Facebook Ships Code — all engineers go through 4 to 6 week “Boot Camp” training where they learn the Facebook system by fixing bugs and listening to lectures given by more senior/tenured engineers. An estimated 10% of each boot camp’s trainee class don’t make it and are counseled out of the organization. Reminded me of Zappos paying people to leave. (via Hacker News)
- EU Funds Scala — it’s a research project at a university, and just got a big pile of funding from the EU.
- Biotic Games — they make Pong, Pacman, Pinball, etc. from biotech. (via Andy Baio)
- Asleep and Awake (BERG London) — It’s glowing rectangles all the way down: those backlit screens that suck your attention. Matt J described it nicely a few years ago: the iPhone is a beautiful, seductive but jealous mistress that craves your attention, and enslaves you to its jaw-dropping gorgeousness at the expense of the world around you. Reminded me of Jesse Robbins’s great line, “mobile is the opposite of mindful”.
Java's wild ride, multicore drives functional, and a look at how the usual programming suspects stacked up in 2010.
This year brought confusion and chaos in the Java space, continued growth for functional languages due to the attack of multicore, and the usual popularity for all of the dynamic languages we know and love.