- How Well Does Name Analysis Work? (Pete Warden) — explanation of how those “turn a name into gender/ethnicity/etc” routines work, and how accurate they are. Age has the weakest correlation with names. There are actually some strong patterns by time of birth, with certain names widely recognized as old-fashioned or trendy, but those tend to be swamped by class and ethnicity-based differences in the popularity of names.
- Old Interfaces — a lazy-scrolling interface to Andy Baio’s collection of faux UIs from movies. (via Andy Baio)
- Pidder — browser-crypto’d social network, address book, messaging, RSS reader, and more.
- What I Learned From Researching Almost Every Single Smart Watch That Has Been Rumoured or Announced (Quartz) — interesting roundup of the different display technologies used in each of the smartwatches.
"social graph" entries
Name Analysis, Old UIs, Browser Crypto Social Network, and Smart Watch Displays
- Modeling Users’ Activity on Twitter Networks: Validation of Dunbar’s Number (PLoSone) — In this paper we analyze a dataset of Twitter conversations collected across six months involving 1.7 million individuals and test the theoretical cognitive limit on the number of stable social relationships known as Dunbar’s number. We find that the data are in agreement with Dunbar’s result; users can entertain a maximum of 100–200 stable relationships. Thus, the ‘economy of attention’ is limited in the online world by cognitive and biological constraints as predicted by Dunbar’s theory. We propose a simple model for users’ behavior that includes finite priority queuing and time resources that reproduces the observed social behavior.
- Mary Meeker’s Internet Trends (Slideshare) — check out slide 24, ~2x month-on-month growth for MyFitnessPal’s number of API calls, which Meeker uses as a proxy for “fitness data on mobile + wearable devices”.
- What I Learned as an Oompa Loompa (Elaine Wherry) — working in a chocolate factory, learning the differences and overlaps between a web startup and a more traditional physical goods business. It’s so much easier to build a sustainable organization around a simple revenue model. There are no tensions between ad partners, distribution sites, engineering, and sales teams. There are fewer points of failure. Instead, everyone is aligned towards a simple goal: make something people want.
- Augmented Reality Futures (Quartz) — wrap-up of tech in the works and coming. Instruction is the bit that interests me, scaffolding our lives: While it isn’t on the market yet, Inglobe Technologies just previewed an augmented reality app that tracks and virtually labels the components of a car engine in real time. That would make popping the hood of your car on the side of the road much less scary. The app claims to simplify tasks like checking oil and topping up coolant fluid, even for novice mechanics.
Graph data is an area that has attracted many enthusiastic entrepreneurs and developers
The popular open source project GraphLab received a major boost early this week when a new company, made up of its founding developers, raised funding to develop analytic tools for graph data sets. GraphLab Inc. will continue to use the open source GraphLab to “push the limits of graph computation and develop new ideas”, but having a commercial company will accelerate development and allow the hiring of resources dedicated to improving usability and documentation.
While social media placed graph data on the radar of many companies, similar data sets can be found in many domains including the life and health sciences, security, and financial services. Graph data is different enough that it necessitates special tools and techniques. Because those tools were too complex for casual users, graph data analytics was in the past the province of specialists. Fortunately, graph data is an area that has attracted many enthusiastic entrepreneurs and developers. The tools have improved and I expect things to get much easier for users in the future. A great place to learn more about tools for graph data is the upcoming GraphLab Workshop (on July 1st in SF).
Data wrangling: creating graphs
Before you can take advantage of the other tools mentioned in this post, you’ll need to turn your data (e.g., web pages) into graphs. GraphBuilder is an open source project from Intel that uses Hadoop MapReduce to build graphs out of large data sets. Another option is the combination of GraphX/Spark described below. (A startup called Trifacta is building a general-purpose data wrangling tool that could help as well.)
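To make the "turn your data into graphs" step concrete, here is a minimal single-machine sketch in plain Python. It is not GraphBuilder's API or a Hadoop job. It only mimics the map/reduce shape of the work: a map step that emits one edge per link, and a reduce step that groups edges into adjacency lists. The `pages` records and the function names are hypothetical, made up for illustration.

```python
from collections import defaultdict

# Hypothetical input: (page, outgoing links) records standing in for crawled web pages.
pages = [
    ("a.html", ["b.html", "c.html"]),
    ("b.html", ["c.html"]),
    ("c.html", ["a.html", "a.html"]),  # duplicate link on purpose
]


def map_edges(record):
    """Map step: emit one (source, target) pair per link in a record."""
    source, links = record
    for target in links:
        yield source, target


def reduce_edges(pairs):
    """Reduce step: group targets by source, dropping duplicate edges."""
    adjacency = defaultdict(set)
    for source, target in pairs:
        adjacency[source].add(target)
    return {source: sorted(targets) for source, targets in adjacency.items()}


def build_graph(records):
    """Chain the two steps; a real MapReduce job would shuffle between them."""
    pairs = (pair for record in records for pair in map_edges(record))
    return reduce_edges(pairs)


graph = build_graph(pages)
```

On a cluster, the grouping done by the dictionary here would instead happen in the shuffle phase, with edges keyed by their source vertex.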
Remixing Success, Scratch in the Browser, 3D Takedown, and Wolfram Network Analysis
- The Remixing Dilemma — summary of research on remixed projects, finding that (1) Projects with moderate amounts of code are remixed more often than either very simple or very complex projects. (2) Projects by more prominent creators are more generative. (3) Remixes are more likely to attract remixers than de novo projects.
- Scratch 2.0 — my favourite first programming language for kids and adults, now in the browser! Downloadable version for offline use coming soon. See the overview for what’s new.
- State Dept Takedown on 3D-Printed Gun (Forbes) — The government says it wants to review the files for compliance with arms export control laws known as the International Traffic in Arms Regulations, or ITAR. By uploading the weapons files to the Internet and allowing them to be downloaded abroad, the letter implies Wilson’s high-tech gun group may have violated those export controls.
- Data Science of the Facebook World (Stephen Wolfram) — More than a million people have now used our Wolfram|Alpha Personal Analytics for Facebook. And as part of our latest update, in addition to collecting some anonymized statistics, we launched a Data Donor program that allows people to contribute detailed data to us for research purposes. A few weeks ago we decided to start analyzing all this data… (via Phil Earnhardt)
A disk-based, single-node, graph analytics system that scales to massive graphs
Designed specifically to run on a single computer with limited memory (DRAM), since its release a few months ago GraphChi has been used to analyze graphs with billions of edges. Running on a single machine means deployment and debugging are simpler. In addition, it is no longer necessary to find (optimal) graph partitions that minimize communication between compute nodes – the starting point for many distributed graph computations.
The stated goal of GraphChi is to “Compute on graphs with billions of edges, in a reasonable time, on a single PC.” One way to define a “reasonable time” is to compare against the results produced by other graph processing systems. That’s exactly what GraphChi’s creators did in a recent paper. They found that GraphChi compared favorably to graph analytics packages such as Pegasus and Stanford GPS. While GraphChi was 2-3X slower in some cases, it is easier to deploy, easier to debug, and way more energy efficient.
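The disk-based idea can be illustrated with a small Python sketch: keep the edge list in a file, hold only per-vertex values in memory, and re-read the edges on every iteration. This is a simplified stand-in, not GraphChi's actual algorithm or API. The file layout (one `src dst` pair per line) and all function names are assumptions made for this example.

```python
import os
import tempfile


def write_edges(path, edges):
    """Store the edge list on disk, one 'src dst' pair per line."""
    with open(path, "w") as f:
        for src, dst in edges:
            f.write(f"{src} {dst}\n")


def stream_edges(path):
    """Yield edges one at a time so the full edge list never sits in memory."""
    with open(path) as f:
        for line in f:
            src, dst = line.split()
            yield int(src), int(dst)


def pagerank_from_disk(path, num_vertices, iterations=20, damping=0.85):
    """PageRank that keeps only per-vertex arrays in memory."""
    # First pass over the file: count out-degrees.
    out_degree = [0] * num_vertices
    for src, _ in stream_edges(path):
        out_degree[src] += 1

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(iterations):
        incoming = [0.0] * num_vertices
        # One sequential pass over the edge file per iteration.
        for src, dst in stream_edges(path):
            incoming[dst] += rank[src] / out_degree[src]
        # Spread the rank of vertices without out-edges evenly.
        dangling = sum(r for r, d in zip(rank, out_degree) if d == 0)
        rank = [
            (1 - damping) / num_vertices
            + damping * (inc + dangling / num_vertices)
            for inc in incoming
        ]
    return rank


with tempfile.TemporaryDirectory() as tmp:
    edge_file = os.path.join(tmp, "edges.txt")
    write_edges(edge_file, [(0, 1), (1, 2), (2, 0), (0, 2)])
    ranks = pagerank_from_disk(edge_file, num_vertices=3)
```

Memory use grows with the number of vertices rather than the number of edges, which is the trade the post describes: more disk passes and some slowdown in exchange for running on one machine.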
Computational Social Science, Infrastructure Drives Design, Narcodrones Imminent, and Muscle Memory
- Computational Social Science (Nature) — Facebook and Twitter data drives social science analysis. (via Vaughan Bell)
- The Single Most Important Object in the Global Economy (Slate) — Companies like Ikea have literally designed products around pallets: Its “Bang” mug, notes Colin White in his book Strategic Management, has had three redesigns, each done not for aesthetics but to ensure that more mugs would fit on a pallet (not to mention in a customer’s cupboard). (via Boing Boing)
- Narco Ultralights (Wired) — it’s just a matter of time until there are no humans on the ultralights. Remote-controlled narcodrones can’t be far away.
- Shortcut Foo — a typing tutor for editors, photoshop, and the commandline, to build muscle memory of frequently-used keystrokes. Brilliant! (via Irene Ros)
Terms of Service, Exporting Copyright, Monitoring Networks, and Learning Programming
- The Medium Terms of Service — easily the best terms of service I’ve ever read. Clear and English wherever possible, apologetically lawyered-up CAPITALS where necessary. Buy that lawyer a beer.
- All Nations Lose Under TPP’s Expansion of Copyright Terms (EFF) — leaks reveal the USA negotiators’ predictable attempt to expand the term of copyright in other nations. TPP is a multinational SOPA.
- Network Theory to Identify Origins of Outbreaks (MIT Technology Review) — “By monitoring only 20% of the communities, we achieve an average error of less than 4 hops between the estimated source and the first infected community”. The paper says it depends on good knowledge of the network, which makes me wonder how useful it will be for government tracing of Anons and the like.
Weibo cf Twitter, Rendering Fonts, Clothing Manufacturing, and Profiling Python
- Social Media in China (Fast Company) — fascinating interview with Tricia Wang. We often don’t think we have a lot to learn from tech companies outside of the U.S., but Twitter should look to Weibo for inspiration for what can be done. It’s like a mashup of Tumblr, Zynga, Facebook, and Twitter. It’s very picture-based, whereas Twitter is still very text-based. In Weibo, the pictures are right under each post, so you don’t have to make an extra click to view them. And people are using this in subversive ways. Whether you’re using algorithms to search text or actual people–and China has the largest cyber police force in the world—it’s much easier to censor text than images. So people are very subversive in hiding messages in pictures. These pictures are sometimes very different than what people are texting, or will often say a lot more than the actual text itself. (via Tricia Wang)
- A Treatise on Font Rasterisation With an Emphasis on Free Software (Freddie Witherden) — far more than you ever thought you wanted to know about how fonts are rendered. (via Thomas Fuchs)
- Softwear Automation — robots to make clothes, something which is surprisingly rare. (via Andrew McAfee)
- A Guide to Analyzing Python Performance — finding speed and memory problems in your Python code. With pretty pictures! (via Ian Kallen)
A Facebook app organizes your friends via shared interests and experiences.
This week's visualization clusters your Facebook friends based on shared education, location, occupation, and interests.
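The app's actual clustering method is not described here, so the following is only one plausible sketch of grouping friends by shared attributes: score each pair of friends by the Jaccard similarity of their attribute sets, link pairs above a threshold, and treat the connected groups as clusters. The `friends` data, the threshold of 0.5, and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical profiles: each friend maps to a set of education, location,
# occupation, and interest labels.
friends = {
    "Ana": {"MIT", "Boston", "engineer", "cycling"},
    "Ben": {"MIT", "Boston", "engineer", "chess"},
    "Cat": {"Auckland", "designer", "surfing"},
    "Dev": {"Auckland", "designer", "surfing", "chess"},
    "Eli": {"Lagos", "teacher"},
}


def jaccard(a, b):
    """Share of attributes two friends have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(profiles, threshold=0.5):
    """Link friends whose similarity meets the threshold; return the groups."""
    parent = {name: name for name in profiles}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(profiles, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in profiles:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


clusters = cluster(friends)
```

Raising the threshold produces more, smaller clusters; lowering it merges groups that share only a single attribute.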