Github: Making Code More Social


Github launched less than a year ago, but it’s already making an impact on how open-source software is created. Rails was there from day one, kick-starting the social software repository’s traffic. Github has taken off, though its traffic still doesn’t compare to Sourceforge’s.

Github combines “standard” features of social networking sites with the distributed source-control system Git. You can follow or message a person, watch or fork projects, and activity streams share your actions. Users can easily fork projects and create their own versions, which can then be merged back into the original or take on a life of their own. Leaderboards help you find

There are 33 languages formally listed on Github. The service is dominated by Ruby (36%) and Javascript (24%), with 31 other languages making up the rest.

Github has a free plan that lets you create your own public repositories, work on them with public collaborators and have 100MB of disk space. The paid plans increase the disk space and add private repositories and private collaborators. The more you pay, the more ACLs you get. It’s a great example of launching with a business model from the get-go. The appeal for a company is easy to see. Why manage their own source control, especially if they might ever open-source any of the code?

One way to really see the impact of Github is to look at Rails. When the project began, DHH was the only contributor. Slowly it grew to include the core team, but when it moved to Github there was a boom. At the 5:05 mark of the video below there is almost an explosion of new committers. This will only increase with Rails 3 and the merger with merb (which is on Github along with an oft-forked merb book).

Ruby on Rails from Ilya Grigorik on Vimeo.

Before and After Rails Moved to Github:

[Image: github viz before] [Image: github viz after]

Github was founded by 4 developers, one of whom impressively left Powerset during Microsoft’s acquisition to go full time with Github.

There are also visualizations of all the code sharing and forking on Github itself, as well as of Python and Apache (though they are not on Github). These videos were created using code_swarm.

I expect github will be talked about a lot at the Web 2.0 Expo this year.

  • What is the point of “distributed, decentralized revision control”?

    Every service provided by github should be and could be better provided by a peering network akin to usenet – a structured, write-once, host-independent database with a P2P replication backbone.

    There should be no non-rival benefit from being the recognized “host” of such a hub. Github is a regression, back towards SourceForge. Ideally, it exists now but for a brief time and mostly to educate people to get past it.


  • Also, the videos of the “visualizations” would be far less pernicious if they lost the emotionally manipulative sound track.


  • Thomas: Many people have this reaction at first, and that’s understandable. Remember that Git is not a centralized version control system: your GitHub repository is just a mirror, a node in a graph. Many people host their repositories at multiple sites. In fact, we encourage this.

    GitHub provides visualization and social tools to help your workflow but by no means purports to be a centralized host. Every time you clone a Git repository, guess what – now you’re a host, too.

    If your repository has a 10 year history, your first push to GitHub will provide even more value than the first push of a new project. You’ll be able to visualize history, comment on commits, fork and modify code, and do all sorts of cool things as if the repository had always been there. The fact that GitHub runs a git-daemon is a very small part of the picture.

    Try it out. You can always take your repository (and its full history) with you if you’re not happy :)

  • Chris:

    So, you are tying a theoretically non-rival resource (git files) to a bunch of needlessly rival resources you happen to own and then calling that progress.

    Good for you!

    That’s all I’m saying, and you agree, according to you.


  • @Chris don’t feed the trolls… :)

    Thanks for an awesome service, which provides A LOT of value. We’re doing tons of decentralized development (right now, contributors to our closed-source stuff are on three continents), and github makes communication and development so much easier. And thanks for all the improvements all the time. You guys rule.

  • Thomas: So what you’re saying is, we should all be using your non-existent theoretical version control system which will also cure aids?

    Github is the killer app that got me to switch to git, and I cringe whenever I have to go back to svn. Because the code is hosted, the barrier to exploring & learning from others’ code is just a few mouse clicks. It exists right now, and it helps me be more productive than I ever was before, as well as inspiring me to push myself to improve in ways I probably never would have otherwise.

    This might seem like a weird thing to say about a version control system/hosting site, but I think you will find a lot of users over there are getting very real value out of something that exists right now, and for that we are grateful.

    Github is only going up, and I’m glad to be along for the ride!

  • Thomas: So what you’re saying is, we should all be using your non-existent theoretical version control system which will also cure aids?

    No, no. Git is just fine for this. The novel visualizations in github look swell. The collectivization of effort (github’s success at adoption) is maybe even the real story: that’s some impressive organizing, as far as it goes.

    My complaint is about the clash in architectures of two sets of the software components there. Git, on the one hand, implements a distributed, decentralized database with a powerful form of ad hoc replication. The rest of the github software implements a local, centralized database creating an artificial (i.e., technically needless) monopoly on administrative powers.

    There isn’t any deep reason (other than habit, by this time) that the rest of the github software has to be built using that centralized architecture. Its database could be made “distributed and decentralized” as well.

    By “tying” the centralized github to the decentralized git, you’ve in effect made git less decentralized, for all practical purposes, for the large number of people subscribed to and using it. That’s why I call it a “regression”.


  • Here’s my POV as a satisfied GitHub customer. (Now using github at No Starch Press, to host copies of book chapters for authors and tech reviewers. We’re not using many of the fun features of git yet, but it does make me happy that everyone has a copy of everything because I don’t trust companies, other people, my own hardware, or my own clumsy fingers and weak git skills.)

    If I wanted to set up a server to run gitosis, I would have to get approval to spend the money on the server (or put in the time to kludge one together from scrap PCs), get an external IP address, and administer the thing.

    Instead, Engine Yard sets up servers and storage by the pallet-load, github pays Engine Yard and administers git, and I only have to fill out a web form and print out a receipt every month to staple to my expense report.

    So far, sounds like your typical Web 2.0 deal, but the difference is that if github turns Eeeevil, all I have to do is build that gitosis server, change refs/remotes, mail my project collaborators to get their ssh keys, and keep going. I never have to deal with one of those “I’m going to scrape your Web 2.0 site/No you’re not, read the ToS” beefs. Git enables collaboration sites to comply with the Franklin Street Statement.

  • So, Don, I think you’re saying that the most important data asset on github is the source code repositories themselves which are, as ever, distributed and decentralized (at least in principle). And the Github feature set is implemented by non-rival software so you could recreate it somewhere else. And, if you *had* to recreate it somewhere else and the github admins were hostile, at most you’d lose some database content around the periphery but the source would be intact and the data you lost wouldn’t matter that much (e.g., people could re-file bugs or re-start threads of discussion or whatever on a new site). Meanwhile, github (as an institution) streamlines the cost structure (mainly by scaling) and, as a centralized entity, can help to find subsidies for some of that cost structure.

    That’s how I take you and I would say that, yeah, I can believe all of that and that sounds about right.

    You could say “If github falls down, it’s not a problem, we can scramble to build a new one with only minimal disruption, for some definition of minimal.”

    I am saying: if you didn’t have a centralized dependency there, the question wouldn’t arise. There’d be no singular hub to fall down causing a need to scramble and suffer “minimal” losses. If some host “fell down” in a crisis, you’d have a much better chance of being robust against that – of not having to “scramble” much at all. And the way to do that, from a bird’s eye perspective, is to harmonize the architecture so that the other features of github are like git itself – based on a decentralized, distributed database featuring ad hoc replication.

    In general, in conversations like this: That someone (like me) has some true criticisms to lay upon some work does not in any way diminish the authentic value of the work. The true criticisms are true whether they are spoken or not. This whole exchange is not an “attack” on github.


  • Leaderboards help you find… what?

  • Github rocks, is growing, it is great, I love it, I like it, and I want them to grow more, they offer a great service.

    Open source was a great breakthrough, especially once it became popular on a global scale.

    That was a huge wave of technological revolution.

    Many do not notice, but it has also transformed the economy and much of daily life on a global scale.

    And now git & github are extending this, and the growth of projects is exponential!

    Because of this, they are leading the way for the technology to come in the next couple of years.

    Keep going guys!