Case Study: Twitter Usage at WordCamp SF

usage chart

One of my many hats is as an events organizer. Twitter has become an invaluable tool for me to gauge the mood of the attendees. Are they excited by the current speaker? Bored or excited at the latest news? Are they having a good time? And most important, are they making connections?

Pathable, an events social networking company, has posted an analysis on the use of Twitter at WordCamp SF. The above chart shows how 797 tweets were categorized by a Pathable intern. Disclosure: I am friends with the co-founders of Pathable and a proud advisor of the company.

Or as Pathable more broadly classifies them:

  • Tweets that are not directly relevant to the vast majority of event attendees (“Here’s what I’m doing / feeling”, “talking directly to someone else”) make up about 1/3 of the tweets sent.
  • Tweets that are useful to people who can’t physically be at the event (“Comments / Quotes about speakers”, “Announcements / Info / Questions related to event”) make up more than 1/3 of the tweets.
  • Tweets that report people’s intended or actual location (“Traveling to”, “At the event / session”) make up around 1/6 of the tweets.

And who do you think sends those tweets?

While 258 people in total sent at least one tweet, 20 people account for more than half of the tweets. That’s consistent at a high level with the “long-tail” notion of user-generated content (i.e., a large number of people contribute small amounts of content, but that content in aggregate accounts for a large proportion of the total content). The numbers, however, don’t fit cleanly in the 80/20 or 90/10 buckets that are often cited. Instead, it’s more like 50/50 (50% of the content is accounted for by a small number of high-activity contributors, 50% by everybody else).
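
With the figures quoted above, 20 of 258 tweeters is roughly 8% of participants accounting for more than half of the tweets. The same question can be asked of any per-user tally. Here is a minimal sketch of that calculation; the function name and the sample counts are hypothetical and are not Pathable's data.

```python
def smallest_group_share(tweets_per_user, content_share=0.5):
    """Return (n_users, fraction_of_users) for the smallest group of top
    tweeters whose combined tweets reach `content_share` of all tweets."""
    counts = sorted(tweets_per_user.values(), reverse=True)
    if not counts:
        return 0, 0.0
    total = sum(counts)
    running = 0
    for n, count in enumerate(counts, start=1):
        running += count
        if running >= content_share * total:
            return n, n / len(counts)
    return len(counts), 1.0


# Hypothetical tally: 9 users, 100 tweets in total.
sample = {"a": 40, "b": 30, "c": 10, "d": 5, "e": 5,
          "f": 4, "g": 3, "h": 2, "i": 1}

n_users, fraction = smallest_group_share(sample)
print(f"{n_users} users ({fraction:.0%} of tweeters) sent at least half of the tweets")
```

Reporting both halves of the ratio (share of users and share of content) makes the result directly comparable with an 80/20 or 90/10 claim.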

twitter events

What I find really interesting is the flow of tweets before, during and after the event (shown above; colors do not correspond to the pie chart). I like seeing that slow build-up to the event and the huge spike during it. The large red band is tweets classified as “Here’s what I’m doing”, the blue band at the bottom is tweets directly related to the event, and the pea green band is conversations. It seems like you’re lucky if there’s much discussion after the event.

This is great after-the-fact analysis, and it got me thinking about what I would want in realtime. In addition to a hashtag search of the realtime tweets, I want a dashboard that shows me the state of the community. The community in this case is self-selected; it’s the people using event tags or interacting with the event Twitter identity. I’d want the following metrics:

  • Community Pulse – What’s the mood of the attendees? Negative or positive? What does the tag cloud look like?
  • Community Connectedness – How many retweets are there? How many people are following each other? Is that number growing over the course of the event?
  • Engagement – What percentage of tweets being sent out by the community are using the tag?
  • Growth – Are more people using the tag? How many new users are we gaining/losing per hour?
  • Influencers – Who are the most connected tweeters in the group?
  • Locations – Where do people claim they are? Or more likely, are from?
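
Two of the metrics above, Engagement and Growth, could be computed roughly as follows. This is only a sketch under stated assumptions, not a real Twitter client: the tweet records are hypothetical dictionaries with `user`, `hour`, and `text` keys, the tag `#wcsf` is made up for illustration, and tag matching is a plain case-insensitive substring check.

```python
from collections import defaultdict


def engagement(tweets, tag):
    """Fraction of the community's tweets that include the event tag."""
    if not tweets:
        return 0.0
    tagged = sum(1 for t in tweets if tag.lower() in t["text"].lower())
    return tagged / len(tweets)


def new_users_per_hour(tweets, tag):
    """Map each hour to the number of users whose first tagged tweet fell in it."""
    first_seen = {}
    for t in sorted(tweets, key=lambda t: t["hour"]):
        if tag.lower() in t["text"].lower():
            # setdefault keeps the earliest hour because tweets are sorted.
            first_seen.setdefault(t["user"], t["hour"])
    per_hour = defaultdict(int)
    for hour in first_seen.values():
        per_hour[hour] += 1
    return dict(per_hour)


# Hypothetical sample records (not real tweets).
tweets = [
    {"user": "ann", "hour": 9, "text": "Heading to #wcsf"},
    {"user": "bob", "hour": 9, "text": "Coffee first"},
    {"user": "ann", "hour": 10, "text": "Great keynote #wcsf"},
    {"user": "cat", "hour": 10, "text": "Loving the talks #WCSF"},
]

print(engagement(tweets, "#wcsf"))
print(new_users_per_hour(tweets, "#wcsf"))
```

A substring check would also match longer tags that begin with the same characters, so a real dashboard would want proper hashtag tokenization.
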
  • One of the challenges is to agree on a hashtag before the event. If the organizers broadly publish a hashtag well before the event, it seems to eliminate the need to follow different streams of conversation. I have seen some events where five or more groups of people each used a different hashtag. Sometimes as the event progresses the streams begin to consolidate, but it is fascinating to watch.

  • Thanks for an interesting, detailed look into tweets during a specific event. It’s pretty fascinating that the aggregate of tweets is split between low tweeters and high. Perhaps it points to the fact that Twitter is still a fairly new mode of communication and thus still has a lot of early adopters?

    In addition to your metrics wish list above, I’d want to know if the “influencers” were predominantly responsible for the 1/3 of tweets directly related to the event, along with how many followers the attendees had before and after the event. Both might be good indicators to help determine if Twitter successfully carried the event message and key takeaways beyond attendees and into the broader Twitterverse.

  • Prakriti

    What was that about 50/50?

    80/20 means 80% of the traffic is generated by about 20% of the users. NOT that 80% of the traffic is generated by a small number of users and 20% by all the rest (though that follows, obviously).

    You need to specify what percentage of total users send 50% of the tweets.

  • The elections in Iran demonstrate one of the specific uses of Twitter.

    Julián Chappa

  • Mia

    Something like the ‘Community Pulse’ was developed at the dev8D event – ‘Whenever anything is Tweeted using the dev8D tag, if a fraction is included to indicate happiness (such as 9/10), it gets added to the dev8D Happier Pipe and the total sum of happiness at dev8d can be seen at a glance’.

    More info at and

    It was quite fun and I assume quite useful for the event organisers, but also meant that the event stream was littered with happiness ratings. Using a variation on the event hash tag for ‘mood ratings’ would let you capture the mood without cluttering up the general stream.

  • Interesting. Learning more about using Twitter and Tweets effectively. Mahalo!


  • Amy

    Cool, reminds me of our work on tweet natural-language-processing for the SXSW Zeitgeist and Internet Week Zeitgeist.

    I wrote an essay on how we made those work:

  • You’re missing the attendees who find Twitter banal and avoid using it. Can you profile that?


  • bowerbird


    excellent point. #twitterisbanal.


  • Good analysis. One small correction:

    The overall distribution still follows expected 80/20 Principle patterns. Most people forget that the 80/20 “Rule” is recursive, and that it also doesn’t have to always match the 80/20 AVERAGE exactly, only that there is a LARGE imbalance away from 50/50.

    If you take 80% of 80%, and 20% of 20%, you get roughly 64/4, and then 51/1. So it is quite fitting that a small minority, say 1% of contributors, creates around half or more of the content, while the other 99% create the other half.

    But it is misleading on the basis of 80/20 formulation to call this 50/50, b/c the Pareto numbers are always about effects and their causes. Comparing 50% of content (effects) to 50% of the other content (more effects), is not following the usage pattern.

    And the data will still show that around 80% of the content (could be ranging anywhere from about 65 to 95), is created by 20% of the users. The expected sharp imbalance is still intact.
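
The recursive arithmetic in the last comment can be checked with a few lines. This sketch assumes a strict 80/20 split applied repeatedly, which is an idealization and not a fit to the WordCamp SF data; the function name is hypothetical.

```python
def recursive_pareto(levels, effect=0.8, cause=0.2):
    """Apply the split repeatedly and return, for each level,
    (share of content, share of contributors producing it)."""
    return [(effect ** n, cause ** n) for n in range(1, levels + 1)]


for content, contributors in recursive_pareto(3):
    print(f"{content:.1%} of content from {contributors:.1%} of contributors")
```

Under that idealization the third level is about 51% of content from under 1% of contributors, which is where the "51/1" figure comes from. For comparison, the post reports 20 of 258 tweeters (roughly 8%) accounting for more than half of the tweets.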