"Twitter" entries

Tweets loud and quiet

Twitter’s long, long, long tail suggests the service is less democratic than it seems.

Writers who cover Twitter find the grandiose irresistible: nearly every article about the service’s IPO this fall mentioned the heroes of the Arab Spring who toppled dictators with 140-character stabs, or the size of Lady Gaga’s readership, which is larger than the population of Argentina.

But the bulk of the service is decidedly smaller-scale–a low murmur with an occasional celebrity shouting on top of it. In comparative terms, almost nobody on Twitter is somebody: the median Twitter account has a single follower. Among the much smaller subset of accounts that have posted in the last 30 days, the median account has just 61 followers. If you’ve got a thousand followers, you’re at the 96th percentile of active Twitter users. (I write “active users” to refer to publicly viewable accounts that have posted at least once in the last 30 days; Twitter uses a more generous definition of that term, including anyone who has logged into the service.)

You're a bigger deal on Twitter than you think

This is a histogram of Twitter accounts by number of followers. Only accounts that have posted in the last 30 days are included. Read more…

The birdie and the shark

Twitter isn't quite beyond jumping the shark, but it has taken a big step backward.

While I’ve been skeptical of Twitter’s direction ever since they decided they no longer cared about the developer ecosystem they created, I have to admit that I was impressed by the speed at which they rolled back an unfortunate change to their “blocking” feature. Yesterday afternoon, Twitter announced that when you block a user, that user would not be unsubscribed from your tweets. And sometime last night, they reversed that change.

I admit, I was surprised by the outraged response to the change, which was immediately visible on my Twitter feed. I don’t block many people on Twitter — mostly spammers, and I don’t think spammers are interested in reading my tweets, anyway. So, my first reaction was that it wasn’t a big deal. But as I read the comments, I realized that it was a big deal: people complaining of online harassment, trolls driving away their followers, and more.

So yes, this was a big deal. And I’m very glad that Twitter has set things right. In the past years, Twitter has seemed to me to be jumping the shark in small steps, rather than a single big leap. If you think about it, this is how it always happens. You don’t suddenly wake up and find you’ve become the evil empire; it’s a death of a thousand cuts. Read more…

Podcast: news that reaches beyond the screen

Finding ways to make media interact with the physical world

Reporters, editors and designers are looking for new ways to interact with readers and with the physical world–drawing data in through sensors and expressing it through new immersive formats.

In this episode of the Radar podcast, recorded at News Foo Camp in Phoenix on November 10, Jenn and I talk with three people who are working on new modes of interaction:

Along the way:

For more on the intersection of software and the physical world, be sure to check out Solid, O’Reilly’s new conference program about the collision of real and virtual.

Subscribe to the O’Reilly Radar Podcast through iTunes, SoundCloud, or directly through our podcast’s RSS feed.


Twitter’s Most Fundamental Value

Twitter could be so much better than an advertising company

We can now gather from Twitter’s IPO that it’s fundamentally postured as an advertising company, but its real value isn’t in advertising. Twitter’s most fundamental value rests squarely within data analytics. Just because Twitter could make a lot of money in advertising doesn’t mean that advertising is where it should concentrate the majority of its efforts or where its most fundamental value proposition lies.

More specifically, Twitter’s most fundamental value is in the overall collective intelligence of its user base when interpreted as an interest graph. Think of an interest graph as a mapping of people to their interests. In other words, if you follow an account on Twitter, what you’re really saying is that you’re interested in that account. Even though there’s lots to be gleaned from all of the little 140-character quips associated with a particular account, there’s a good bit you can tell about a person solely by examining the accounts that the person follows.
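
To make the "mapping of people to their interests" idea concrete, here is a minimal sketch. The account names and follow lists are made up for illustration, and the two helper functions are hypothetical names, not part of any Twitter library.

```python
# Hypothetical data: each key is a person, each value is the set of
# accounts that person follows (i.e., their expressed interests).
follows = {
    "alice": {"nasa", "pydata", "nytimes"},
    "bob": {"nasa", "espn"},
    "carol": {"pydata", "nasa"},
}


def interested_in(account, follows):
    """Return the people who follow `account`, sorted by name."""
    return sorted(
        person for person, accounts in follows.items() if account in accounts
    )


def shared_interests(a, b, follows):
    """Return the accounts that both `a` and `b` follow."""
    return follows[a] & follows[b]


print(interested_in("pydata", follows))            # ['alice', 'carol']
print(shared_interests("alice", "carol", follows))
```

Nothing here looks at tweet text: the follow sets alone are enough to say that alice and carol overlap on two interests.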

Read more…


Investigating the Twitter Interest Graph

Why Is Twitter All the Rage?

I’m presenting a short webcast entitled Why Twitter Is All the Rage: A Data Miner’s Perspective that is loosely adapted from material that appears early in Mining the Social Web (2nd Ed). I wanted to share out the content that inspired the topic. The remainder of this post is a slightly abridged reproduction of a section that appears early in Chapter 1. If you enjoy it, you can download all of Chapter 1 as a free PDF to learn more about mining Twitter data.
Read more…


Writing Paranoid Code

Computing Twitter Influence, Part 2

In the previous post of this series, we aspired to compute the influence of a Twitter account and explored some variables relevant to arriving at a base metric. This post continues the conversation by presenting some sample code for making “reliable” requests to Twitter’s API to facilitate the data collection process.

Given a Twitter screen name, it’s (theoretically) quite simple to get all of the account profiles that follow the screen name. Perhaps the most economical route is to use the GET /followers/ids API to request all of the follower IDs in batches of 5,000 per response, followed by the GET /users/lookup API to retrieve full account profiles for up to Y of those IDs in batches of 100 per response. Thus, if an account has X followers, you’d need to anticipate making ceiling(X/5000) API calls to GET /followers/ids and ceiling(X/100) API calls to GET /users/lookup. Although most Twitter accounts may not have enough followers that the total number of requests to each API resource presents rate-limiting problems, you can rest assured that the most popular accounts will trigger rate-limiting enforcements that manifest as an HTTP error in RESTful APIs.
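
The arithmetic above, plus one "paranoid" way of reacting to a rate-limit error, can be sketched as follows. This is an illustration under stated assumptions, not the code from the full post: `RateLimitError` and `call_with_retries` are hypothetical names, the `request` argument stands in for whatever function actually calls the API, and exponential backoff is just one possible waiting strategy.

```python
import math
import time

IDS_PER_CALL = 5000      # batch size for GET /followers/ids, per the text
PROFILES_PER_CALL = 100  # batch size for GET /users/lookup, per the text


def calls_needed(num_followers):
    """Return (calls to /followers/ids, calls to /users/lookup)."""
    return (
        math.ceil(num_followers / IDS_PER_CALL),
        math.ceil(num_followers / PROFILES_PER_CALL),
    )


class RateLimitError(Exception):
    """Hypothetical error a client raises when the API says 'slow down'."""


def call_with_retries(request, max_retries=3, wait_seconds=1.0, sleep=time.sleep):
    """Call `request()`; on RateLimitError, wait and retry with doubling waits.

    `sleep` is injectable so the function can be tested without waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the final retry
            sleep(wait_seconds * 2 ** attempt)


print(calls_needed(12345))  # (3, 124)
```

For an account with 12,345 followers, that is 3 calls for the IDs and 124 calls for the profiles, which shows why the lookup step is the one most likely to hit a rate limit first.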

Read more…

Four short links: 3 October 2013

USB in Cars, Capture Presentations, Amazon Redshift, and Polytweeting

  1. Hyundai Replacing Cigarette Lighters with USB Ports (Quartz) — sign of the times. (via Julie Starr)
  2. Freeseer: free, open source, cross-platform application that captures or streams your desktop—designed for capturing presentations. Would you like freedom with your screencast?
  3. Amazon Redshift: What You Need to Know — good write-up of experience using Amazon’s column database.
  4. GroupTweet: Allow any number of contributors to Tweet from a group account safely and securely. (via Jenny Magiera)

Computing Twitter Influence, Part 1: Arriving at a Base Metric

The subtle variables affecting a base metric

This post introduces a series that explores the problem of approximating a Twitter account’s influence. With the ubiquity of social media and its effects on everything from how we shop to how we vote at the polls, it’s critical that we be able to employ reasonably accurate and well-understood measurements for approximating influence from social media signals.

Unlike social networks such as LinkedIn and Facebook in which connections between entities are symmetric and typically correspond to a real world connection, Twitter’s underlying data model is fundamentally predicated upon asymmetric following relationships. Another way of thinking about a following relationship is to consider that it’s little more than a subscription to a feed about some content of interest. In other words, when you follow another Twitter user, you are expressing interest in that other user and are opting in to whatever content that user would like to place in your home timeline. As such, Twitter’s underlying network structure can be interpreted as an interest graph and mined for insights about the relative popularity of one user when compared to another.
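
A small sketch of what "asymmetric" means in practice, using made-up follow edges. Raw follower count is used here only as the crudest possible popularity signal, not as the base metric this series arrives at; the function names are hypothetical.

```python
from collections import Counter

# Hypothetical directed edges: (follower, followed).
edges = [
    ("alice", "bob"),
    ("carol", "bob"),
    ("bob", "alice"),
    ("dave", "bob"),
    ("dave", "alice"),
]


def follower_counts(edges):
    """Count incoming follow edges per account."""
    return Counter(followed for _, followed in edges)


def is_reciprocated(a, b, edges):
    """True only if `a` follows `b` and `b` follows `a`."""
    edge_set = set(edges)
    return (a, b) in edge_set and (b, a) in edge_set


counts = follower_counts(edges)
print(counts["bob"], counts["alice"])          # 3 2
print(is_reciprocated("carol", "bob", edges))  # False
```

Carol follows bob without bob following back, which a symmetric "friend" model cannot represent.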
Read more…


Pattern-detection and Twitter’s Streaming API

In some key use cases a random sample of tweets can capture important patterns and trends

Researchers and companies who need social media data frequently turn to Twitter’s API to access a random sample of tweets. Those who can afford to pay (or have been granted access) use the more comprehensive feed (the firehose) available through a group of certified data resellers. Does the random sample of tweets allow you to capture important patterns and trends? I recently came across two papers that shed light on this question.

Systematic comparison of the Streaming API and the Firehose
A recent paper from ASU and CMU compared data from the streaming API and the firehose, and found mixed results. Let me highlight two cases addressed in the paper: identifying popular hashtags and influential users.

Of interest to many users is the list of top hashtags. Can one identify the “top n” hashtags using data made available through the streaming API? The graph below is a comparison of the streaming API to the firehose: n (as in “top n” hashtags) vs. correlation (Kendall’s Tau). The researchers found that the streaming API provides a good list of hashtags when n is large, but is misleading for small n.

[Figure: streaming API vs. firehose, n (as in “top n” hashtags) vs. correlation (Kendall’s Tau)]
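
For readers unfamiliar with Kendall’s Tau, here is a minimal sketch of how such a comparison could be computed. The hashtag counts are made up, the function names are hypothetical, and this is a simplified tau-a over the firehose’s top n hashtags (tied pairs count as neither concordant nor discordant), not necessarily the exact variant used in the paper.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a for paired values; needs at least two pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both sources order this pair the same way
        elif sign < 0:
            discordant += 1  # the sources disagree on this pair
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def top_n_tau(n, firehose, sample):
    """Compare counts for the firehose's top-n hashtags across both sources."""
    top = sorted(firehose, key=firehose.get, reverse=True)[:n]
    return kendall_tau(
        [firehose[tag] for tag in top],
        [sample.get(tag, 0) for tag in top],
    )


# Made-up hashtag counts, for illustration only.
firehose = {"#a": 900, "#b": 700, "#c": 400, "#d": 300, "#e": 100}
sample = {"#a": 10, "#b": 6, "#c": 7, "#d": 2, "#e": 1}

print(top_n_tau(5, firehose, sample))  # 0.8
```

A value of 1.0 means the sample orders the hashtags exactly as the firehose does; each swapped pair pulls the value down, and with fewer pairs a single swap costs more.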

Read more…