- Digital Continuity Conference Proceedings — proceedings from a New Zealand conference on digital archiving, preservation, and access for archives, museums, libraries, etc.
- What Are The Scaling Issues to Keep in Mind While Developing a Social Network Feed? (Quora) — insight into why you see the failwhale. (via kellan on Twitter)
- Fan Feeding Frenzy — Amanda Palmer sells $15k in merch and music in 3m via Bandcamp. Is the record available on iTunes yet? Absolutely not. We have nothing against iTunes, it’ll end up there eventually I’m sure, but it was important for us to do this in as close to a DIY manner as possible. If we were just using iTunes, we couldn’t be doing tie-ins with physical product, monitoring our stats (live), and helping people in real-time when they have a question regarding the service. Being able to do all of those things and having such a transparent format in which to do it has been a dream come true. We all buy stuff on the iTunes store – or AmazonMP3 or whatever – but it’s not THE way artists should be connecting to fans, and it’s certainly not the way someone is going to capture the most revenue on a new release. (via BoingBoing)
- Sad State of Open Source in Android Tablets — With the exception of Barnes & Noble’s Nook e-reader, a device that isn’t even really a tablet, I found no tablet manufacturer who was complying with the minimum of their legal open source requirements under GNU GPL. Let alone supporting community development.
"social graph" entries
There's a difference between people you know and the people you're like.
Social search is similar to pre-Google traditional search: results feel arbitrary and unreliable. But a focus on similarity could push social search into a new phase.
Preservation, Scaling Social Networks, Monetizing Music, and Android Unopened Source
Open Data, Open PCR, Open Sara Winge, and Open Source Big Graph Mining
- Learning from Libraries: the Literacy Challenge of Open Data (David Eaves) — a powerful continuation of the theme from my Rethinking Open Data post. David observes that dumping data over the fence isn’t enough, we must help citizens engage. We have a model for that help, in the form of libraries: We didn’t build libraries for an already literate citizenry. We built libraries to help citizens become literate. Today we build open data portals not because we have a data or public policy literate citizenry, we build them so that citizens may become literate in data, visualization, coding and public policy.
- OpenPCR on Kickstarter — In 1983, Kary Mullis first developed PCR, for which he later received a Nobel Prize. But the tool is still expensive, even though the technology is almost 30 years old. If computing grew at the same pace, we would all still be paying $2,000+ for a 1 MHz Apple II computer. Innovation in biotech needs a kick start!
- Wingeing It — profile of O’Reilly’s wonderful Sara Winge by the ever fabulous Quinn Norton.
- PEGASUS — petascale graph mining toolkit from CMU. See their most recent publication. (via univerself on Delicious)
Legal XML, Big Social Data, Crowdsourcing Tips, Copyright Balkanization
- XML in Legislature/Parliament Environments (Sean McGrath) — quite detailed background on the use of XML in legislation drafting systems, and the problems caused by convention in that world–page/line number citations, in particular. (Quick gloat: NZ’s legislature management system is kick-ass, and soon we’ll switch from print authoritative to digital authoritative)
- Large-Scale Social Media Analysis with Hadoop — In this tutorial we will discuss the use of Hadoop for processing large-scale social data sets. We will first cover the map/reduce paradigm in general and subsequently discuss the particulars of Hadoop’s implementation. We will then present several use cases for Hadoop in analyzing example data sets, examining the design and implementation of various algorithms with an emphasis on social network analysis. Accompanying data sets and code will be made available. (via atlamp on Delicious)
- Breaking Monotony with Meaning; Motivation in Crowdsourcing Markets (Crowdflower) — This finding has important implications for those who employ labor in crowdsourcing markets. Companies and intermediaries should develop an understanding of what motivates the people who work on tasks. Employers must think beyond monetary incentives and consider how they can reward workers through non-monetary incentives such as by changing how workers perceive their task. Alienated workers are less likely to do work if they don’t know the context of the work they are doing and employers may find they can get more work done for the same wages simply by telling turkers why they are working.
- Balkanizing the Web — The very absurdity of the global digital system is revealing itself. It created all the instruments for global access and, then, turned around and arbitrarily restricted its commercial use, paving the way for piracy. Think about it: our broadband networks now allow seamless streaming of films, TV shows, music and, soon, of a variety of multimedia products; we have created sophisticated transaction systems; we are getting extraordinary devices to enjoy all this; there is a growing English-speaking population that, for a significant part of it, is solvent and eager to buy this globalized culture and information. But guess what? Instead of a well-crafted, smoothly flowing distribution (and payment) system, we have these Cupertino, Seattle or Los Angeles-engineered restrictions. The U.S. insists on exporting harsh copyright penalties and restrictions, while not exporting license agreements and Fair Use, so the rest of the world gets very grumpy.
Secrets to Success, Sousveillance, Etherpad Lives, Personal Social Networks
- The Ten Commandments of Rock and Roll (BoingBoing) — ten rules that should be posted in every workplace as a guide to how to fail poisonously.
- Snapscouts — rather creepy sousveillance site. It’s up to you to keep America safe! If you see something suspicious, Snap it! If you see someone who doesn’t belong, Snap it! Not sure if someone or something is suspicious? Snap it anyway! I like the idea of promoting a shared interest in keeping us all safe, but I’m not sure SnapScouts is there yet. (update: Ha, it’s a brilliant joke! See the comments for more)
- Diaspora Kickstarter Project — team looking for seed funding to write an aGPLed “privacy aware, personally controlled, do-it-all distributed open source social network” (no news of dessert topping or floor wax applicability). Received 2.5x their requested funding in a few days.
Gov App Building, Android FPS, Graph Mining, Keeping Fit
- Who Is Going To Build The New Public Services? — a thoughtful exploration of the possibilities and challenges of third parties building public software systems. There’s a lot of talk of “just put up the data and we’ll build the apps” but I think this is a more substantial consideration of which apps can be built by whom.
- Quake 3 for Android — kiss the weekend goodbye, Nexus One owners! My theory is that no platform has “made it” until a first-person shooter has been ported to it. (via BoingBoing)
- Graph Mining — slides and reading list from seminar series at UCSB on different aspects of mining graphs. Relevant because, obviously, social networks are one such graph to be mined.
- Treadmill Desk — I want one. Staying fit while working at a sedentary job is important but not easy. I tried to type while using a stepper, but that’s just a recipe for incomprehensible typing fail. (via BoingBoing)
Social Network Search for Morons, Bulking Up Bio Data, Better E-Mail, Better Standards
- Spokeo — abysmal indictment of society, first prize in mankind’s race to the bottom. Uncover personal photos, videos, and secrets … GUARANTEED! Spokeo deep searches within 48 major social networks to find truly mouth-watering news about friends and coworkers. PS, anybody who gives their gmail username and password to a site that specializes in dishing dirt can only be described as a fucking idiot. (via Jim Stogdill, who was equally disappointed in our species)
- Biologists rally to sequence ‘neglected’ microbes (Nature) — The Genomic Encyclopedia of Bacteria and Archaea is a project to sequence genomes from more branches of the evolutionary tree of life. Eisen’s team selected and sequenced more than 100 ‘neglected’ species that lacked close relatives among the 1,000 genomes already in GenBank. The researchers reported earlier this year at the JGI’s Fourth Annual User Meeting that even mapping the first 56 of these microbes’ genomes increased the rate of discovery of new gene and protein families with new biological properties. It also improved the researchers’ ability to predict the role of genes with unknown functions in already sequenced organisms. (via Jonathan Eisen)
- Mail Learning: The What and the How (Simon Cozens) — a few things that a really good mail analysis tool needs to do. I hope that my mail client and server do these out of the box in the next five years.
- Introducing the Open Web Foundation Agreement — The Open Web Foundation Agreement itself establishes the copyright and patent rights for a specification, ensuring that downstream consumers may freely implement and reuse the licensed specification without seeking further permission. In addition to the agreement itself, we also created an easy-to-read “Deed” that provides a high level overview of the agreement. Applying the open source approach to better standards.
Social Media Parasites, Open Government Data, Prime Numbers, Amazon Image Abuse
- I’m Tired of Your Analogue Attitude — hilarious animated clip about social media gurus, made using xtranormal. (via trib on twitter)
- Three Laws of Open Government Data — 1. If it can’t be spidered or indexed, it doesn’t exist; 2. If it isn’t available in open and machine readable format, it can’t engage; 3. If a legal framework doesn’t allow it to be repurposed, it doesn’t empower. (also see slide deck)
- Structure and Randomness in the Prime Numbers — paper about some of the fun mathematics around prime numbers. (via Hacker News)
- Abusing Amazon Images — decoding and doing fun things with the Amazon images API. The cool thing (if you want to generate unlikely Amazon images) is that you’re not limited to one use of any of these commands. You can have multiple discounts, multiple shadows, multiple bullets, generating images that Amazon would never have on its site. However, every additional command you add generates another 10% to the image dimensions, adding white space around the image. And that 10% compounds; add a lot of bullets, and you’ll find that you have a small image in a large blank space. (You can use the CR command to cut away the excess, however.) Note also that the commands are interpreted in order, which can have an impact on what overlaps what.
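To get a feel for how quickly that compounding 10% swallows the picture, here is a rough back-of-the-envelope sketch. It assumes each extra command scales both canvas dimensions by exactly 1.10 while the original image keeps its size; the function names are made up for illustration and nothing here calls or reflects Amazon's actual image service.

```python
def padded_size(width, height, extra_commands, growth=0.10):
    """Estimate canvas size after stacking extra commands.

    Assumption: every extra command multiplies both dimensions
    by (1 + growth), so the padding compounds.
    """
    factor = (1 + growth) ** extra_commands
    return round(width * factor), round(height * factor)


def image_fraction(extra_commands, growth=0.10):
    """Fraction of the canvas width still occupied by the original image."""
    return 1 / (1 + growth) ** extra_commands


if __name__ == "__main__":
    # Show how the canvas grows and the image shrinks relative to it.
    for n in (0, 1, 5, 10):
        w, h = padded_size(100, 100, n)
        print(f"{n:2d} extra commands: canvas {w}x{h}, "
              f"image spans {image_fraction(n):.0%} of the width")
```

Under those assumptions, ten stacked commands leave the original image covering well under half the canvas width, which matches the "small image in a large blank space" effect described above.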
DIY SPY, Screencasting, Social Network Analysis, Term Extraction
- DIY SPY – a homebrew 2.4GHz wi-fi spectrum analyzer — As proof of concept (and a cool toy for anyone who has one of these lying around), I have implemented a working Wi-Fi spectrum analyzer on TI’s ez430-RF2500 development kit ($50), a 2-part USB dongle which consists essentially of a CC2500 radio strapped to an MSP430 low-power microcontroller (detachable bottom half) and a USB interface which enumerates as a virtual serial port (top half). The top half doubles as a standalone MSP430 programmer, so this kit is a great cheap way to get started playing with them. (via joshua on Delicious)
- Screenr — Instant screencasts for Twitter. Flash-based, uploads to their site and tweets the URL. The whole “for Twitter” thing is going a little too far: who records screencasts only for Twitter? It’s like having a spellchecker only for three-letter words.
- Social Network Analysis in R — video and slides for talk on doing social network analysis with R.
- We’re Keeping the Term Extraction Service — Yahoo!’s useful API gets a stay of execution. OK, we heard you. You’ve made it clear to us that shutting down the Term Extraction Service would be a mistake. So, we’ve changed our plans. We’re leaving the service up and running indefinitely. (via Simon Willison)
A collection inspired by Science Foo Camp attendees
- Endogenous steroids and financial risk taking on a London trading floor (PNAS) — We found that a trader’s morning testosterone level predicts his day’s profitability. We also found that a trader’s cortisol rises with both the variance of his trading results and the volatility of the market. Our results suggest that higher testosterone may contribute to economic return, whereas cortisol is increased by risk. Our results point to a further possibility: testosterone and cortisol are known to have cognitive and behavioral effects, so if the acutely elevated steroids we observed were to persist or increase as volatility rises, they may shift risk preferences and even affect a trader’s ability to engage in rational choice.
- The Origin of Universal Scaling Laws in Biology — eye-opening paper that blew my mind. Highlight of Sci Foo was meeting the author and shaking his hand. Relates metabolic rate, size, heart rate, and lifespan by applying physics to biology.
- Ushahidi — open source software for managing disasters. The Ushahidi Engine is a platform that allows anyone to gather distributed data via SMS, email or web and visualize it on a map or timeline. Our goal is to create the simplest way of aggregating information from the public for use in crisis response.
- Dissecting the Canon: Visual Subject Co-Popularity Networks in Art Research — In this paper we analyze a classic dataset of art research, which collects ancient art and architecture and their Western Renaissance documentation since 1947. [T]here is clearly a long tail of monument