Jun 18

Tim O'Reilly

The Rise of Open Source Java

Last year at OSCON, I gave a presentation entitled What Book Sales Tell Us About the State of the Tech Industry. One of the conclusions I drew was that Java was in decline, as its share of total programming language book sales had dropped by five percentage points in the twelve months ending June 2004. Well, we just re-ran those numbers and saw a startling reversal.

Here's the updated trend graph showing programming language market share as reflected by weekly book sales reported by Nielsen Bookscan from January 2003 to mid-June 2005:


Note: ".Net languages" includes books that cover any of the .Net supported languages (C/C++, C# and Visual Basic). "Other languages" includes ActionScript, shell scripting, and Lisp. Note also that percentages in this graph do not equal those in the 2004 graph, because that graph did not include JavaScript, ActionScript, or shell scripting (other languages), while this one does. A larger version of the graph can be found here.

You can see that there was a sharp uptick in Java book sales starting in July of 2004 -- Java's share of all programming books sold is up about 3% since June of '04. A lot of this growth spurt occurred shortly after JavaOne and the new Tiger release, which happened around that time. All of the top titles were revised, and saw a healthy sales increase as a result. However, when we analyzed new books (versus revisions), it appears that a substantial portion of Java's sustained growth, outside of the classic titles, has come from books on Open Source Java projects, such as Spring, Struts, Lucene, and AspectJ, which collectively performed at nearly double the unit and revenue volumes of new books on their non-Open Source counterparts.

These results indicate that a lot is happening in the Open Source Java community, at least on the book side. To support this positive trend, we've devoted a whole track to Open Java at this year's OSCON, to be held August 1-5 in Portland, OR. And of course, we're continuing our usual strong coverage of PHP, Perl, Python, and Ruby.

Back to the graph: you can see that sales of books on C# have leveled off, while books on C/C++ have seen something of a resurgence along with Java. PHP book sales have also leveled off, while Python has continued to gain ground against Perl, and we're perhaps seeing the beginning of an uptick in JavaScript book sales, driven no doubt by interest in AJAX. I'll be giving full details of our research in my technology trend rundown at OSCON, but I thought I'd give a few details on this particular bit of news ahead of time.

By the way, there's still time to sign up -- early registration ends on the 20th. Hope to see you there.

Comments: 34

  Justin Watt [06.18.05 11:11 AM]

Tim, first a question. Do those numbers reflect only O'Reilly's titles, or the whole tech book market?

Second, a comment: assuming it was just O'Reilly's titles, I would find it interesting to see a similar graph where the total number of book sales in a category is divided by the total number of books available for sale in that category. I think this might be a better indicator of the strength or popularity of a given language.

  k [06.18.05 03:16 PM]

Could you please post a full-sized, real image of the stats? Flash is a non-standard, non-portable, proprietary[*] format and not everyone can display it. Just say no to Flickr.

[*] Yes, I know Macromierda supposedly published the specs; it still sucks, and thank God no one has reimplemented it.

  Kevin Farnham [06.18.05 04:04 PM]

This graph is incredibly interesting. However, I find it hard to distinguish the colors and match the key to the lines in some cases. Is a text (or .csv) file containing the data available?

  Tim O'Reilly [06.18.05 04:31 PM]

Justin -- This is for all books, not just O'Reilly books. As reported previously on this blog, the data comes from Nielsen Bookscan, which reports point of sale data from most major bookstores in the US, including Amazon, Barnes & Noble, Borders, and leading independents. Collectively, it represents actual sell-through data for approximately 70% of the US domestic market. So these percentages represent share of all programming books sold.

k, and Kevin -- the full-size image is here. (I also added a link to the larger image in the original article above - sorry I didn't think of this earlier.) For any images I post, you can also just go directly to my photostream at Flickr. (Go to Flickr, choose People from the menu at the top, and search for my name. You can also look for the O'Reilly Radar photo group.) I do agree that Flickr's URLs are a bit opaque, but they have created so many other kinds of openness that I have to disagree that use is contraindicated. Ditto for Flash.

In addition, if you're having trouble matching up the lines with the legend, note that the captions in the legend are in the same order as the lines, from the top, so even if you can't match the colors, you should be able to count down, and where lines cross or are very close, the color should help disambiguate, even if it's a little difficult. I understand that there's a lot of data in this graph -- fewer lines would probably help. But I'm on the road, and don't have time to generate a better one.

  Stephen Downes [06.18.05 06:13 PM]

When you consider that Java books in book stores outnumber, say, Perl books by margins greater than 10 to 1, when you consider that schools teach almost nothing but Java (and .Net), when you consider that Java is mandated in many university and corporate environments, it seems to me that remaining static at about 25 percent in fact represents a substantial pushback. Despite every advantage, Java cannot gain on what are essentially unsupported, unauthorized and underground scripting languages.

And btw, the image is too small; I can't look at it comfortably. Clicking takes me to Flickr, where it appears no larger. Maybe there's a big one behind the secret passage ("Go to Flickr, choose People from the menu at the top, and search for my name...") but the secret passage is too convoluted to spend time on (esp. when I've already been to Flickr and it was still too small).

Tiny images are bad for people with bad eyes like me. Flickr doesn't help me there, and Flash actually makes things worse. IMHO you should just link to a big copy of the image, and leave Flash and Flickr to their own devices.

  Rich Sharples [06.18.05 06:28 PM]

Flickr stopped using Flash for 'public' pages about a month ago - AFAIK the only piece still crippled by Flash is the organizr (sic), which you only see if you are posting pictures. The regular pages now use DHTML.

More here

  Jeremy Dunck [06.18.05 09:56 PM]

The larger image is here

  Jeremy Dunck [06.18.05 10:33 PM]

Urg, sorry, that was the image from 2004.

Here's the 2005 one.

  Damien B [06.18.05 11:23 PM]

You need to be logged in to see the bigger Flickr image. Visitors who don't want to register with Flickr are restricted to the lilliputian image.

  Greg Rollins [06.18.05 11:56 PM]

Tim, do these numbers reflect Safari?
I've only bought 2 books in the past 7 months because of Safari. I pretty much just add to my library whatever I need.

Answer from Tim: No, these numbers reflect point of sale data from print book sales only. We have begun an analysis of the relationship between Safari sales and print book sales to see if there is any consequent decline in print when books become available online. We haven't yet completed that analysis. However, the availability of online books is an unlikely factor in Java's decline because the chart shows relative percentages of all programming language book sales. So any decline would affect all languages similarly. (There might be a skew relative to C#, since not all the Microsoft Press books, and none of the Wrox or APress books, are in Safari. But for most languages, O'Reilly and Pearson, collectively, are so dominant -- and so all of the dominant language books are in Safari -- that any effect should be minimal.)

In addition, we have some evidence from a macro perspective that supports the idea that Safari does not have a negative impact on print book sales. And that is the fact that O'Reilly and Pearson, the two publishers most heavily represented in Safari, are also the only two major publishers to have seen market share gains in print book sales (per Bookscan) during the three years for which we have data. That being said, other publishers have books available in competing online services. And some of those services give away online access to books as a loss-leader for their online training business, so it could well be that there is a decline that is masked by offsetting factors.

  Simon Willison [06.19.05 03:51 AM]

Here's the big image on Flickr without you needing to log in.

  Tom [06.19.05 09:30 AM]

Doesn't this suggest that Java has maintained or increased in complexity, thereby requiring one or more books on a variety of topics for one to do anything useful? And that the Java community has failed to provide adequate documentation and support?

At this point, I'm finding that it is impossible to be productive in Java without several books. As opposed to working with Ruby, where Dave Thomas' book is pretty much all one needs (unless you go the Rails route, in which case you'll be lost without one of the half dozen books due this summer).

  Remus Pereni [06.19.05 12:58 PM]

I don’t think Java per se became more complex. It’s just that these days we are used to a certain type of development: code reuse and way too much functionality, plus little development time. We want logging, transparent database connectivity, transactions, configuration, modularity, web services, scripting integration and, more importantly, smart and easily maintainable code. As somebody (I can’t recall exactly who) put it, we Java developers are architecture and design freaks. We impose a standard of overly complicating things, and with them our lives; we have to think about everything at an enterprise level, even when that is not called for.

If the specific target is complex and requires all that complexity, we are in pole position: we have everything already prepared, which of course is great. Why try to reinvent the wheel and make all the inherent mistakes when somebody already did that before us and shared the results? In this case you will need the books for Spring, RCP, Hibernate, ... but consider the time you would need to reinvent all those.

Most of the time the real target is not as complex as we might think.

  Doug [06.19.05 04:57 PM]

The only significant thing I see going on is the decline of VB books, and the blips in Java, as noted, because of an updated Java version and updated/new Java books. The blips in other languages are probably explained similarly - updated or new books released, school starting or ending, etc.
The graph is barely readable. I count only 10 lines, but 11 series in the legend.

And even if the data counts Amazon, Barnes & Noble, and other leading independent book sellers, probably a lot of people, particularly in .NET get their books directly from the publisher, or used, or at Microsoft and other techie conventions. Also .NET hasn't changed much the past couple of years (which is what this graph covers), and Microsoft has much more documentation and samples online than Sun.

Answer from Tim: The 11th line on the graph is for Ruby, and it's so small relative to the others that it doesn't show up, except as a thickening of the bottom line of the chart.

As to lots of people getting their books direct from the publisher being an explanation of the changes -- it would seem to me that that would affect all languages equally. In any event, it is in fact a tiny fraction of all books sold.

  Jose Luis Hurtado [06.19.05 10:03 PM]

Dear Tim,

Great graph, thanks for sharing this with the community.
Could you post a higher resolution graph?

Regarding Java, I think open source Java will happen, it can not be stopped, and it will bring life back to the language and the whole community.

One thing Java needs, though, is a serious alternative for RIA, a.k.a. Avalon XML, Flash, something simple and powerful, tag based... let's hope it happens someday : )

I am very interested in the details of the lower part of the graph, particularly Ruby and Python.


Jose Luis Hurtado

  J.T. Wenting [06.20.05 01:28 AM]

Could it be that the temporary slump in the sales of Java books was caused by people anticipating the new books (and editions) with coverage of Tiger?
I know I deferred some purchases until after Tiger was released, I'd not be surprised if many others did the same.

Answer from Tim: I don't think so. First off, the slump was year-long. We usually see a slowdown of only a few months before a new software release. That being said, it had been a long time since a new release, and that is certainly a factor in book sales. However, contrast Perl -- also a long time since a new release, with a long flat line, rather than the kind of steep decline we saw in Java during that period.

  Gary [06.20.05 11:06 AM]

It looks like the percentages of book sales are flat, except for a decline in VB.

It would be more helpful to see absolute sales numbers instead of percentages. Are the percentages normalized to one single value, to a month-by-month total, or a year-by-year total? These things make a graph of percentages much less useful.

Answer from Tim: Actually, absolute sales numbers are much harder to parse, as there is pronounced seasonality in the market, as well as a significant overall decline in the computer book market over the three year period shown here. That's why we chose percentages. These numbers show the relative percentage of all computer books sold (or at least all reported by Bookscan). The percentages are normalized week by week throughout the period shown.
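
As a rough illustration of the week-by-week normalization described above, here is a minimal Python sketch. The function name `weekly_shares` and the unit counts are hypothetical, made up purely for illustration; they are not Bookscan data, and this is not necessarily how the actual analysis was run. Each week's unit sales are divided by that week's total, so a market-wide seasonal dip leaves every language's share unchanged.

```python
def weekly_shares(weekly_units):
    """Convert per-week unit counts into per-week percentage shares.

    weekly_units: list of dicts mapping a language name to units sold
    in that week. Each week is normalized against its own total.
    """
    shares = []
    for week in weekly_units:
        total = sum(week.values())
        if total == 0:
            # No sales recorded this week: report a zero share for each language.
            shares.append({lang: 0.0 for lang in week})
        else:
            shares.append(
                {lang: 100.0 * units / total for lang, units in week.items()}
            )
    return shares


# Hypothetical numbers: the second week has half the total sales of the
# first (a seasonal dip), but the mix of languages is the same.
sample = [
    {"Java": 250, "C/C++": 150, "Perl": 100},
    {"Java": 125, "C/C++": 75, "Perl": 50},
]

for week in weekly_shares(sample):
    print(week)
```

In this made-up example both weeks come out as 50% Java, 30% C/C++ and 20% Perl, even though absolute sales halved, which is the property that makes shares easier to read than raw unit counts in a seasonal, shrinking market.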

  Tim O'Reilly [06.20.05 03:20 PM]

Based on some of the questions here, I see that it might be useful to point people to a previous posting, Book Sales as a Technology Trend Indicator, which explains the data source and the methodology.

  David [06.21.05 12:56 PM]

The thing about Java is that there is an awful lot of free, quality online documentation and examples. It's very easy to "google" and find a lot of useful examples in a matter of minutes.

  Tim O'Reilly [06.21.05 01:36 PM]

David -- there is a lot of free online documentation for other languages as well. What's more, this documentation has been available online for the entire period covered by the graphs, so I don't think that this is a factor in the changes.

  dadij [06.29.05 01:14 PM]

The TIOBE Programming Community index gives an indication of the popularity of programming languages. The index is updated once a month and the numbers go all the way back to 2001. It is interesting to compare it to the book sales numbers. C is number one on that list (June 2005), with Java coming next and Perl in third place. Our beloved Javascript is in 11th place just a shade above... yes, you guessed it: COBOL! Lots of food for thought when you compare the two indexes!

  grennis [06.29.05 01:28 PM]

I'm confused. You have two categories called C# and ".NET Languages". Well, C# obviously IS a .net language. Why do you have them split like this?

  Tim O'Reilly [06.29.05 02:28 PM]

Grennis, see the note directly under the figure. Many .Net books cover multiple languages. Rather than disaggregating them into the various languages, leading to double-counting, we kept them as a separate category. So in theory you could add those numbers to either VB or C# or C/C++ or all of the above. But we don't know what language the purchasers of those books were targeting, so we left that info in a category of its own.

  Mark [07.18.05 02:33 PM]

Question for Tim: Where does ColdFusion sit within the graph? Is it considered part of Java? If not, where do you see CF in the future?


  Ekaterina [08.02.05 07:56 AM]

Dear Mr O'Reilly,
could you possibly advise me which sources you used for detailed statistics on book sales? I am looking for data on book sales in different subjects to estimate their popularity, but am a bit at a loss as to where to start.
I will be very grateful for any help.
thank you very much in advance,
Sincerely yours,

  Scott Ellsworth [08.10.05 04:19 PM]

I just looked at the tech books I bought since my Safari subscription expired at the end of December:

Ruby and Ruby on Rails books: 2
Subversion: 1

I then looked at my internal development hours over the same seven months. Days spent prepping new technologies:

Java: 21
Ruby: 2
Subversion: 2
Perl: 1
Cocoa: 2
MacOS X: 3

The number of books accurately represents neither what is making me money nor the far more expensive time I am spending. I believe this is for two reasons: the depth of references I already had, and the maturity of the information sources. The first will vary by person, but the second will vary by the state of technology.

I have a shelf full of Java books. I also know which people in the community I can learn from quickly, so I go straight to their web sites.

In new technologies, like Ruby, or subversion, I did not have that pile of existing information, nor did I have a short list of the best web-based sources of help. I had to go the dead tree route, until I knew enough to ask good questions.

I expect to pick up Hibernate, a Java technology, in the next couple weeks. I may buy the Manning book, but I have used technologies like it enough that I may not need to. It is very likely that I can get up to speed by simply pushing a toy project through, and running some benchmarks. This is not true for Rails - the new book looks like it will speed my acquisition of it greatly.

Even though Java produces most of my revenue, I do not show up as a big consumer of Java books, simply because I already have ready access to so much information.

I considered whether this was a statement about my own knowledge, as opposed to a general trend, and decided that it was not. The books on my shelf represent my own state of knowledge, but my ability to find good reference works on the web is a statement about what information I can find. Simply put, Java, and Perl, are both mature enough that I can find top quality references. Even the new stuff in Java 5 is likely to be written about on sites that are already known to be good for Java. Ruby, on the other hand, is still building that knowledge base.


  Tim O'Reilly [08.11.05 08:55 AM]

Scott --

Very interesting post! On many fronts.

I think you're right that the availability of information on the web competes with books, and that people may buy fewer books as the web content matures. Or at least, buy different books. For example, one trend we've noticed is that (contrary to what one might assume without thinking hard about it), advanced books tend to sell best early in a technology's lifecycle. More introductory books become bestsellers as a technology matures.

Why? Early adopters are tech savvy, learn quickly, and already have a large knowledge-base.

Similarly, as I just reported in my O'Reilly Research tech trends talk at OSCON (link warning: large PDF download of slides), job postings go up as book sales go down.

In short, book sales are a leading indicator of what is on the developer radar -- what people want to learn about.

But back to your main point: the availability of online information is indeed a huge downward driver of the technology book market (and a huge concern for us technology publishers). Google is our biggest competitor, and we're trying to learn how to make it our biggest channel instead -- hence our interest in Safari and things like Google Print. (Incidentally, why did you drop your Safari subscription? We need to make it a must-have service, and we need feedback from people who dropped it to tell us just what they were missing.)

  william Householder [10.12.05 12:49 PM]

May I use your graph in an English paper about the rise in the use of Java-based technology?

  Gabriel [02.16.06 09:59 AM]

Java is a widely used technology that has been around for over a decade now. Serious businesses (let's say enterprise level) are using Java. Java knowledge is widely available on the internet, unlike .NET, which is fairly new (5 years). That's why Microsoft technologies are selling more books. Like anything else, Microsoft is expensive. You can develop a J2EE application without spending a penny on application servers (JBoss), IDEs (Eclipse), frameworks (Open Source, e.g. Struts ...), or documentation (widely available on the internet). To quote what Linus T. said once... I am comparing science with witchcraft.
Let's not forget the marketing effort at Microsoft, which is pushing "witchcraft" technologies on the market.

  Anonymous [03.30.06 09:25 PM]

good to read and good to get in mind

  mareal [06.25.06 04:01 AM]

I'm just curious: how many programming languages exist today? Can anybody answer me?

  Bashar [07.22.06 01:10 PM]

This is a very interesting subject. Sometimes we should forget about where this came from and focus on how we can use it.

  L505 [12.10.07 12:05 PM]

Books should be sold in open binders where we can remove the pages and copy them freely. Hardback and paperback books are closed, binary-like software where we must pay hefty licensing/rental fees to own the book, just like Microsoft software.

O'Reilly must open the books and let people freely copy them, even if O'Reilly goes broke by doing so. Sorry.
