May 16

Mike Hendrickson

State of the Computer Book Market, Part Four - Programming Languages Q1 07

In this fourth post [one, two and three are found here] on the State of the Computer Book Market, I will look at programming languages and drill in a little on each language area.

A Treemap view of the Programming Languages

Overall the Q1 '07 market for programming languages was down about 8.3% when compared with Q1 '06 (the 40,229-unit drop is 9.10% of the Q1 '07 total). There were 482,079 units sold in Q1 '06 versus 441,850 units sold in Q1 '07.
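The decline can be checked directly from the two unit totals. The sketch below is illustrative only (the helper name `percent_change` is my own, not part of any reporting tool); it computes the change against both the Q1 '06 and the Q1 '07 totals, because the choice of base moves the percentage noticeably. The 9.10% figure corresponds to the Q1 '07 base.

```python
# Unit totals quoted in the paragraph above.
units_q1_2006 = 482_079
units_q1_2007 = 441_850


def percent_change(old, new, base):
    """Return the change from old to new as a percentage of base."""
    return 100 * (new - old) / base


drop = units_q1_2006 - units_q1_2007
change_vs_2006 = percent_change(units_q1_2006, units_q1_2007, units_q1_2006)
change_vs_2007 = percent_change(units_q1_2006, units_q1_2007, units_q1_2007)

print(f"Unit drop: {drop:,}")
print(f"Change as a share of the Q1 '06 total: {change_vs_2006:.2f}%")
print(f"Change as a share of the Q1 '07 total: {change_vs_2007:.2f}%")
```

Year-over-year comparisons are normally stated against the earlier period, which is why the two results are worth keeping apart.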

In the treemap view below, you will notice a couple of bright green areas -- namely Ruby and Transact SQL. JavaScript and Python, which are also green [not bright green], show nice growth when compared to the Q1 '06 timeframe. Actionscript, VBScript and .Net Languages are the other languages showing growth in Q1 '07. The rest of the languages were either flat or down.

[Treemap: Languages, Q1 '07 vs. Q1 '06]

Before I begin to drill in on the languages, I thought it would be best to explain our "language dimension." Our view on languages is not just strictly about programming with a particular language, although we capture those very easily, but that the book being categorized has code examples in a particular language. So Flash Programming with Java would be in our Flash atomic category, but the language dimension would be Java. Similarly, our Head First Design Patterns book contains all examples written in Java, so it too carries the "java" tag on the language dimension. So with this language dimension information in mind, I am going to add one more grouping before we dive in. For the sake of grouping and presenting this information in a more readable format, I have classified the categories for the languages in this way:

Category      Q1 07 Unit Range
Major         20,000 - 65,000
Mid-Major     2,000 - 19,999
Mid-Minor     1,000 - 1,900
Minor         100 - 999
Irrelevant    0 - 99
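To make the language dimension and the unit ranges concrete, here is a minimal sketch, not the actual data mart. The `Book` fields, the "design patterns" category label, the Lua title, and all unit numbers are hypothetical. The category floors assume the lower bounds of 20,000, 2,000, 1,000 and 100 units used in the section descriptions later in this post.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    atomic_category: str  # what the book is about
    language: str         # language of its code examples (the language dimension)
    units: int            # Q1 '07 units; made-up numbers in this sketch


# Lower bound of each category in Q1 '07 units, highest first (assumed floors).
CATEGORY_FLOORS = [
    ("Major", 20_000),
    ("Mid-Major", 2_000),
    ("Mid-Minor", 1_000),
    ("Minor", 100),
    ("Irrelevant", 0),
]


def classify(units):
    """Return the category name for a language's Q1 '07 unit total."""
    for name, floor in CATEGORY_FLOORS:
        if units >= floor:
            return name
    raise ValueError("units must be non-negative")


def units_by_language(books):
    """Sum units per language, regardless of each book's atomic category."""
    totals = defaultdict(int)
    for book in books:
        totals[book.language] += book.units
    return dict(totals)


books = [
    Book("Flash Programming with Java", "flash", "java", 1_200),
    Book("Head First Design Patterns", "design patterns", "java", 900),
    Book("A Hypothetical Lua Primer", "lua", "lua", 150),
]

totals = units_by_language(books)
for language, units in totals.items():
    print(f"{language}: {units:,} units -> {classify(units)}")
```

The point of the sketch is that a book rolls up under the language of its examples, so the Flash title counts toward Java here even though its topic category is Flash.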

Now let's dive into this treemap and take a closer look at the languages. The tables that I am showing will contain the following header:

*Major*        UNITS               TITLES              MARKET SHARE          PARETO
1. Language    2. 2006  3. 2007    4. 2006  5. 2007    6. 06Mkt  7. 07Mkt    8. Top 10  9. 20%

  1. Name or short name of the language
  2. Units sold in Q1 '06
  3. Units sold in Q1 '07
  4. Number of Titles making Bookscan 3000 in 2006
  5. Number of Titles making Bookscan 3000 in 2007
  6. 2006 Market Share
  7. 2007 Market Share
  8. Percent of total Language market for the Top 10
  9. 20% of the Titles make what percent of the Total

The pareto principle is roughly stated as the 80/20 rule where 20% of an economic distribution makes up 80% of the total. So in the case of computer books, that would mean that 20% of the titles in a particular language area would make about 80% of the total units. I also think it is important to look at the Top 10 titles and see what percentage they make of the total area being measured. Some markets are dominated by one or two books. The pareto is only useful when looking at the Major and Mid-Major languages because the other languages are typically one or two book markets. I have not included any pareto percentages on those smaller market language areas.
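The two Pareto columns can be sketched as follows. This is an illustration under stated assumptions, not the queries behind the tables: the function names are mine, the per-title unit figures are invented, and rounding the 20% title count to the nearest whole title is my own choice, since the post does not say how fractional counts are handled.

```python
def share_of_total(title_units, count):
    """Percent of total units contributed by the `count` best-selling titles."""
    total = sum(title_units)
    if total <= 0:
        raise ValueError("need at least one title with positive unit sales")
    ranked = sorted(title_units, reverse=True)
    return 100 * sum(ranked[:count]) / total


def top_10_share(title_units):
    """The 'Top 10' column: share of units held by the ten best sellers."""
    return share_of_total(title_units, 10)


def pareto_share(title_units, fraction=0.20):
    """The '20%' column: share of units held by the top 20% of titles."""
    # Assumption: round to the nearest whole title, but always use at least one.
    count = max(1, round(len(title_units) * fraction))
    return share_of_total(title_units, count)


# Hypothetical per-title unit sales for one language (not Bookscan data).
sample_units = [5000, 3000, 1000, 400, 200, 150, 100, 80, 40, 30]

print(f"Top 10 share: {top_10_share(sample_units):.0f}%")
print(f"Share held by the top 20% of titles: {pareto_share(sample_units):.0f}%")
```

With only ten titles in the sample, the top 10 necessarily cover the whole market, which is the situation described above for the smaller language areas.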

This chart shows the four-year trend for the Major programming languages. Red is used to highlight 2007.


The following table is the "raw" data for Q1 '07 and Q1 '06 for the major languages. As you can see, Ruby is a bright spot for the languages right now. However, the number of Ruby titles more than tripled, from 7 in 2006 to 23 in 2007, while units grew by about 68% (from 15,089 to 25,380), and this only increased Ruby's market share by 2.6 percentage points. Sometimes the data behind growth charts reveals that things are not as impressive as they appear on the surface. The "Languages" are a big market and it will take some sustained heavy growth periods to get Ruby higher up on the scale. It may be a little unfair to pick on Ruby, as it just made the cutoff of the Major languages threshold, yet it has shown rapid growth in the last couple of years. The rest of the languages, except for JavaScript, are older and established, and all of them except .Net Languages showed declines. (Note: ".Net Languages" refers to books that include both C# and Visual Basic. You could add those numbers to those for either of the other languages, but it is more likely that they should be counted for C#.)
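The Ruby numbers can be reproduced from the unit totals and the Ruby row of the table. The sketch below is only a check of the arithmetic (the variable and function names are mine). It reports unit growth against both bases: dividing the increase by the Q1 '06 units gives roughly 68%, while dividing it by the Q1 '07 units gives roughly 41%.

```python
# Figures taken from this post: overall language units and Ruby units per quarter.
TOTAL_UNITS = {2006: 482_079, 2007: 441_850}
RUBY_UNITS = {2006: 15_089, 2007: 25_380}


def market_share(units, total):
    """Units as a percentage of the whole language market."""
    return 100 * units / total


share_2006 = market_share(RUBY_UNITS[2006], TOTAL_UNITS[2006])
share_2007 = market_share(RUBY_UNITS[2007], TOTAL_UNITS[2007])
share_gain_points = share_2007 - share_2006

unit_increase = RUBY_UNITS[2007] - RUBY_UNITS[2006]
growth_vs_2006 = 100 * unit_increase / RUBY_UNITS[2006]
growth_vs_2007 = 100 * unit_increase / RUBY_UNITS[2007]

print(f"Ruby share: {share_2006:.2f}% -> {share_2007:.2f}%")
print(f"Share gain: {share_gain_points:.1f} percentage points")
print(f"Unit growth as a share of Q1 '06 units: {growth_vs_2006:.1f}%")
print(f"Unit growth as a share of Q1 '07 units: {growth_vs_2007:.1f}%")
```

Either way, the gain in market share is small next to the growth in the number of titles, which is the point being made above.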

Major Programming Languages

The major languages each sold more than 20,000 units in Q1 '07. These are what I consider the major languages.

*Major*                      UNITS            TITLES      MARKET SHARE            PARETO
Language             2006     2007     2006     2007    06Mkt    07Mkt   Top 10      20%
java               77,782   63,136      228      206   16.13%   14.29%      34%      68%
c#                 53,855   52,655      120      116   11.17%   11.92%      33%      54%
javascript         34,741   48,266       40       74    7.21%   10.92%      50%      58%
php                59,524   41,933       67       78   12.35%    9.49%      53%      64%
c/c++              53,232   41,311      185      163   11.04%    9.35%      29%      59%
.net languages     23,183   30,712       53       70    4.81%    6.95%      51%      59%
visual basic       44,401   26,385      112       95    9.21%    5.97%      41%      58%
ruby               15,089   25,380        7       23    3.13%    5.74%      82%      62%
sql                25,784   22,188       51       56    5.35%    5.02%      49%      61%

Notice in the table that the top 10 titles [the Top 10 column under Pareto] typically represent between 30 and 50 percent of the market for that specific language. Two noticeable exceptions are Ruby and C/C++. C/C++ has more titles and is not heavily loaded on the "bestseller" side of things, but rather dispersed among many titles. Ruby, on the other hand, does not have as many titles, so the top 10 represents a high percentage of the language's total titles as well as the units. From a title growth perspective, JavaScript had the largest increase in new titles making the Bookscan list.

Here are the top five titles for the major languages:

Publisher    Title
Pragmatic    Agile Web Development with Rails
O'Reilly     JavaScript: The Definitive Guide
O'Reilly     Ajax on Rails
O'Reilly     Head First Java
O'Reilly     Head First Design Patterns

Mid-Major Programming Languages

The following languages all sold between 2,000 and 19,999 units in Q1 '07. These are what I am considering the mid-major languages.

*Mid-Major*                  UNITS            TITLES      MARKET SHARE            PARETO
Language             2006     2007     2006     2007    06Mkt    07Mkt   Top 10      20%
actionscript       16,769   18,531       29       26    3.48%    4.19%      88%      50%
vba                21,420   16,717       47       43    4.44%    3.78%      59%      52%
perl               14,415   10,308       41       33    2.99%    2.33%     100%      45%
python              9,170    9,909       22       23    1.90%    2.24%      81%      53%
transact sql        2,878    5,846        7       13    0.60%    1.32%      99%      50%
vbscript            3,022    3,308        7        7    0.63%    0.75%     100%      45%
shell script        4,294    2,935       12       10    0.89%    0.66%     100%      45%
basic               3,029    2,341        6        5    0.63%    0.53%      39%      58%
pl/sql              2,207    2,033       13       12    0.46%    0.46%     100%      20%

You'll notice in the Mid-Major languages that Actionscript, Python, Transact SQL and VBScript are the languages showing growth when you compare the first quarters of 2006 and 2007. Python achieved its growth by adding one more title to the list, and ActionScript grew with three fewer titles. In this language area, there are some languages where the top ten titles account for 100% of the language's total -- Perl, VBScript, Shell Script, and PL/SQL.

Here are the top titles for the mid-major languages.

Publisher          Title
O'Reilly           Learning Perl
For Dummies        Excel VBA Programming For Dummies
For Dummies        Excel 2003 Power Programming with VBA
O'Reilly           Learning Python
Microsoft Press    Microsoft VBScript Step by Step

Mid-Minor Programming Languages

The following languages all sold between 1,000 and 1,900 units in Q1 '07. These are what I am considering the mid-minor languages.

*Mid-Minor*                  UNITS            TITLES      MARKET SHARE
Language             2006     2007     2006     2007    06Mkt    07Mkt
windows script      3,771    1,859        7        5    0.78%    0.42%
powershell            527    1,827        1        5    0.11%    0.41%
groovy                  -    1,552        -        3    0.00%    0.35%
objective c         1,784    1,424        6        5    0.37%    0.32%
assembly            1,734    1,049       11       10    0.36%    0.24%

The noticeable trend with the mid-minor languages is that Groovy came from nowhere and has now sold more than 1,500 copies in the first quarter of 2007. Also, PowerShell titles are starting to nudge the needle a bit.

Here are the top titles for the mid-minor languages.

Publisher            Title
Manning              Windows PowerShell in Action
Manning              Groovy in Action
Microsoft Press      Microsoft Windows Scripting Self-Paced Learning Guide
Wiley                Reversing: Secrets of Reverse Engineering
Course Technology    Microsoft Windows Powershell Programming for the Absolute Beginner

Minor Programming Languages

The following languages all sold between 100 and 999 units in Q1 '07. These are what I am considering the minor languages.

*Minor*                      UNITS            TITLES      MARKET SHARE
Language             2006     2007     2006     2007    06Mkt    07Mkt
sas                   574      999        7       10    0.12%    0.23%
applescript         1,053      894        6        4    0.22%    0.20%
mdx                   516      764        4        3    0.11%    0.17%
abap                  538      624        1        2    0.11%    0.14%
latex                 826      594        4        4    0.17%    0.13%
awk                   889      580        2        2    0.18%    0.13%
lisp                  759      557        4        4    0.16%    0.13%
lua                   138      509        2        3    0.03%    0.12%
tcl                   612      398        2        2    0.13%    0.09%
scheme                331      371        4        5    0.07%    0.08%
haskell                47      345        2        4    0.01%    0.08%
directx               552      310        1        1    0.11%    0.07%
mysql spl               -      282        -        1    0.00%    0.06%
mel                   541      260        3        2    0.11%    0.06%
vhdl                  201      225        3        2    0.04%    0.05%
rpg                    52      210        1        2    0.01%    0.05%
cobol                 171      168        1        3    0.04%    0.04%
c                      45      114        2        6    0.01%    0.03%

The noticeable trend in the minor languages is that Haskell is up, although two additional Haskell titles were added to the space in Q1 '07. The four Haskell titles are each averaging fewer than 30 copies per month.

Here are the top titles for the minor languages.

Publisher         Title
Wrox              Professional SQL Server Analysis Services 2005 with MDX
O'Reilly          sed & awk
Prentice Hall     Practical Programming in Tcl and Tk
O'Reilly          AppleScript: The Definitive Guide
Addison-Wesley    ABAP Objects: Introduction to Programming SAP Applications

Irrelevant Programming Languages

The following languages all sold fewer than 100 units in Q1 '07. These are what I am considering the irrelevant programming languages.

*Irrelevant*                 UNITS            TITLES      MARKET SHARE
Language             2006     2007     2006     2007    06Mkt    07Mkt
alice                  64       71        1        2    0.01%    0.02%
delphi                345       48        3        1    0.07%    0.01%
ocaml                   -       38        -        1    0.00%    0.01%
jcl                    30       33        1        1    0.01%    0.01%
realbasic               -       31        -        1    0.00%    0.01%
ada                    26       11        2        1    0.01%    0.00%
labview               148        -        1        -    0.03%    0.00%
lingo                  30        -        1        -    0.01%    0.00%
squeak                 20        -        1        -    0.00%    0.00%
rexx                   17        -        1        -    0.00%    0.00%
fortran                16        -        1        -    0.00%    0.00%

Here are the top titles for the irrelevant languages. Incidentally, each of these titles sold fewer than 60 units in the first quarter of 2007.

Publisher        Title
Prentice Hall    Learning to Program with Alice
Wordware         Inside Delphi 2006
Apress           Practical OCaml
Mike Murach      Murach's OS/390 and z/OS JCL
Apress           Beginning REALbasic: From Novice to Professional

So this concludes the languages view of the State of the Computer Book Market. I hope you enjoyed it. Pay attention to this space, as I will be publishing this information quarterly. Now that I have all the queries, spreadsheets, pivot-tables and systems down, I should be able to update these posts much more easily going forward. If you have anything you would like explored a bit more thoroughly, please leave a comment here and I will see what I can do.

Comments: 30

  Michael R. Bernstein [05.16.07 06:15 PM]

The number of additional Python titles seems low. Are you counting Django, Turbogears, Zope, and Plone books as Python books?

  MySchizoBuddy [05.16.07 06:31 PM]

Some typos
The Major- Minor table Should say

Major : 20,000 65,000
Mid Major : 2,000 19,999

  davidm [05.16.07 08:04 PM]

Keep pumping out those Ruby titles!

  Tim O'Reilly [05.16.07 08:32 PM]

Michael -- yes, the language dimension counts books that use Python as their language as Python books, regardless of their topic.

  Peter Szinek [05.17.07 01:14 AM]

Strange that Erlang is not even 'irrelevant' - there is at least one (supposedly quite good) title, 'Programming Erlang' from the Pragmatic programmers.

  Al [05.17.07 02:30 AM]

Mike this is fascinating

@Peter the Erlang book is available electronically (Beta) but not yet physically. On that I have a question.

Are you considering electronic sales as well as dead tree sales? It appears to me that programmers use more and more online resources/books as time progresses, and it would be interesting to factor this in, particularly for those emerging languages.


  Aaron 'Teejay' Trevena [05.17.07 05:54 AM]

Nice to see that Learning Perl is a top selling book in the mid-major section - I keep on hearing the 'nobody is learning perl these days' FUD from python/ruby fanboys and I guess they're wrong :)

  Mike Hendrickson [05.17.07 05:57 AM]

Remember this data is for the Q1 '07 timeframe. The Prags book is not out in physical form yet -- looks like July.
We are not measuring online sales here as we have no idea what other publishers make in electronic sales.

  Aaron 'Teejay' Trevena [05.17.07 06:03 AM]

It would be really nice to see how these numbers work out cumulatively.

Because of the lack of historical data beyond changes year by year for the last couple of years, it's hard to see how many actual books are in circulation - for instance I borrowed Learning Perl from a friend when I started Perl, but then bought the Cookbook, The Camel book, Apache Modules with C and Perl, and recently more specialist titles such as HOP and OOP.


  Paul Mison [05.17.07 06:28 AM]

Great informative post, but the tables are distractingly Web 1.0. How about a small touch of prettifying CSS? This seems to work nicely for a first cut.

table { border: 1px solid black; }
tr { background: #ccc; }
td { border: 0px; padding: 1px; font-size:90%; }

  John Styles [05.17.07 02:32 PM]

I am probably not looking in the right place to find your detailed explanation, but when Tim said (in the post referenced from the first part ' O'Reilly Research has been loading the weekly top 3000 Computer Books into a MySQL-based data mart.') does that mean that every week the top 3000 are loaded so if a book on, say, Fortran was in the top 3000 for some weeks and not others then the total would only include the weeks in which it was in the top 3000?
What I am really asking is are there REALLY only a handful of Fortran books being bought every year, or are books / publishers / book-sellers being excluded? And if so, are the other figures just as bogus?

  Tim O'Reilly [05.17.07 04:07 PM]

John Styles --

Yes, the total would only include the weeks in which a book is actually in the top 3000. However, you should be aware that the #3000 book sells no more than perhaps 10 copies a week nationwide.

So that means that if a book doesn't appear in the top 3000, it means that it probably sold fewer than 500 copies in the course of a year as an upper bound.

We do also have a top 10,000 books report, but we don't usually work against it, because we haven't classified all the books at the lowest levels with the necessary level of accuracy to provide useful analysis.

I don't consider these figures bogus as a result.

  Tim O'Reilly [05.17.07 04:24 PM]

Peter Szinek --

Yes, bookscan tracks only print book sales through bookstores, not downloadable books, or even print book sales direct from publisher websites. Erlang will eventually show up. Only time will tell how important it will be.

  Tim O'Reilly [05.17.07 04:26 PM]

Paul, thanks for the suggestion about improving the tables. We took your suggestion and implemented it, and you're right, the tables look a lot better.

  Devdas Bhagat [05.18.07 12:34 AM]

Is there any chance of getting global geographical distribution of book sales?

  John Styles [05.18.07 02:21 AM]

Tim :- thanks for that.

However, this does give some idea of the point at which the noise level at the bottom begins to crowd out useful information.

One could imagine a situation where language foo has 10 books all getting approximately the same percentage of the market share for people buying books on foo, slightly below this 500 a year threshold, meaning, say 10 * 450 = 4500 books are sold, which would put it half-way up the mid-major list and yet it wouldn't appear in these tables at all.

This seems intuitively about right to me in that clearly you would put VBScript, Transact-SQL and everything above in some sort of 'major' list in which you would be certain that nothing not in that major list would be 'more major' (if you see what I mean), but below that threshold pretty much all bets would be off.

A point of note is that clearly your method favours minor languages with one or two dominant books.

  forthfreak [05.19.07 01:59 PM]

Hey what happened to FORTH? I love that language and was under the impression that it was among the top ten. This list is in error!

  Rafael Ferreira [05.19.07 05:40 PM]

Are textbooks counted in the programming languages category? For instance Cay Horstman's Big Java as a Java book or MIT Press' Structure and Interpretation of Computer Programs as a Scheme one and Concepts, Techniques and Models of Computer Programming as a Mozart/Oz book?

  Tim O'Reilly [05.20.07 07:59 AM]

Rafael --

Yes, they are, to the extent that they are sold through bookstores tracked by Bookscan (and I think most are.) We see a huge spike in books and topics that are the subject of courses at the appropriate times in the academic calendar. Now, it's certainly possible that there are large textbook adoptions that don't show up, but most should.

  Danny [05.20.07 12:24 PM]

Interesting data, but some of the classifications seem a bit dodgy. In particular, I'd suggest "Irrelevant" is very ill-advised, especially as it isn't qualified with for whom those languages are irrelevant. If you want to write avionics software, Ada is very relevant. Javascript was considered irrelevant by 'serious' programmers, until Ajax came along. What's more any half-decent programmer can learn a lot from learning a different language, whatever the popularity of that language.

(Nits: isn't C# a .Net language? The only Squeak I've heard of is an implementation of the Smalltalk language)

  Mike Hendrickson [05.21.07 06:02 AM]


I chose the word Irrelevant because these titles are, from a book sales perspective, irrelevant. I completely understand the relevance that all of these languages have and have had, but from a book sales perspective, they are what they are.

As far as C#, we have kept it separate intentionally as we have many VB.Net, ASP.Net, ADO.Net, and Visual Studio.Net titles to view in the .Net Languages category.

  Jim Kring [05.21.07 05:33 PM]

Hi Mike,

Great article.

I'm curious why you have no data for "LabVIEW" books in 2007 (and only one title in 2006). There are several LabVIEW books on the market, including one that I co-authored :-)

I believe that, according to your scheme, LabVIEW would easily be considered a "Minor" programming language.

  Mike Hendrickson [05.21.07 05:55 PM]


Notice that the sort order on the category is on 2007 Units. Since LabView has not made a top 300 report this year, it has no 2007 units and therefore it made the Irrelevant category.

  Jim Kring [05.21.07 06:13 PM]


Thanks for the explanation. I didn't understand how the data was being collected.

Also, did you mean "top 3000" (rather than "top 300")?

  Markus Mottl [05.23.07 09:05 AM]


there is good reason why some languages seem "irrelevant" in terms of book sales: you do not publish more book titles in these languages!

The sad fact is that O'Reilly has published a decent book on OCaml in French (a very small market!), but has refused to publish an English version already translated and proofread in a coordinated effort by around sixty qualified volunteers for free (see Developing Applications with Objective Caml). So much about public relations...

It completely amazes me how your marketing department could make such a horrible marketing blunder to publish a book in a language that does not even remotely have the reach of English as a language of computer science while not publishing an available, good English translation that was offered to you by the translators for free (!). Given that situation it doesn't make sense to label a language as "irrelevant", not even in terms of your book sales.

I happen to work for a Wall Street company that employs almost twenty OCaml developers full-time right now, and is quite likely to employ two or three times that many within the next five years if present growth rates continue. Our company alone would probably buy a whole bunch and several copies of good OCaml-books if they were available, even at higher prices, and we know of other companies using OCaml that would do the same.

There is currently no major publisher that publishes good and reasonably up-to-date books in this language. If your marketing department did its homework well and looked at where and how OCaml is being used rather than "irrelevant book" sales (or more fittingly: irrelevant "book sales"), they would not see this as a lack of interest but as an opportunity to take over a promising market before the competition does.

The book "Practical OCaml", which was published by Apress, is easily one of the most horrible books on a programming language to ever enter the market as evidenced by its devastating ratings on Amazon (I refused the publisher's offer to write a technical review): catastrophic layout, lots of technical mistakes, and weak language. This is quite obviously the only reason why this book has such weak sales, not because there is a lack of interest.

I can only hope that my comments will motivate you to have a serious chat with your marketing department on whether it might be more reasonable to provide better coverage in promising new markets with high margins rather than fighting teeth and claw over market share in established markets with low margins. Other languages on your list certainly do not deserve the label "irrelevant" either, neither by technical merit nor future promise of book sales, which is the only thing that should matter from a business point of view. Past sales do not.

  Mike Hendrickson [05.23.07 02:20 PM]

This data is not just about what O'Reilly publishes, but the whole market. The number of titles in an area is a reflection of what that given area can bear. I am interested in bringing the Practical OCaml book to market if it is truly ready to go. Please contact me at mikeh at oreilly dot com.

  Jason Prickard [05.24.07 09:02 AM]

@Markus: O'Reilly Media is no longer the underdog publisher it used to be. They are no longer willing to take the risks.

The problem you describe has a lot to do with the way ORM is structured: English-language titles have to be published by the Sebastopol office, which is getting less willing to take risks by the day. On the other hand, they waste a lot of money on silly stuff like MAKE.

It's a natural thing: the bigger you get, the more risk-averse you get and the more diluted your offer becomes.

I watch ORM's growth with mixed feelings. The original spirit, the culture of ORA is gone. That's why they missed the rise of Ruby and Ruby on Rails, which they are now catching up with as fast as they can. And they will miss the next big thing, because they are separating themselves from the innovators whose knowledge they want to capture.

The solution is very simple: give the local offices the green light to make independent decisions, including the right to publish English-language titles. Turn them into independent publishers and implement a system that lets them inform each other what they are working on, so they don't compete against each other, yet they can make their own independent decisions that no other office can block. Otherwise ORM will be just like any other US corporation, clueless about the world abroad.

I see there's no EuroOSCON this year. Could this be the sign of too much power given to Sebastopol?

Will they change? I doubt it, but that's OK, they will be replaced by another underdog, and soon.

  Tim O'Reilly [05.24.07 09:36 AM]


I'm a bit perplexed by your comments. You say at the same time that we are "risk averse" and that we "waste money on silly stuff like Make." Which is it? We only go for the sure thing, or we really go out on a limb?

Make is a great example of an area where O'Reilly's gotten way out in front, and has invested heavily in something we believe in that's really up and coming: the new interpenetration of computing and the physical world.

Yes, we did take our eye off the ball with Ruby. But that's assuming that we can be everywhere at once, and do everything. We're still a small company with limited resources, and we do our best to catch the most interesting stuff. We published very early on Ruby, when it was going nowhere, and weren't paying attention when it started to take off.

But I think we've recovered nicely. The Prags, getting there first, have the top book on each of Ruby and Rails, but we hold the number 2, 3, and 4 spots on both. We also run RailsConf and RailsConf Europe, both huge successes.

Re. local offices publishing English language titles: they can and do.

But in the end, whether our local offices publish or our US offices publish, a book has to be something bookstores will buy and that will pay for itself. That's not as easy as you seem to think.

We had a lot of years where you could make money on practically any book, but as these reports show, the computer book market has been in a slump since 2001. We're investing in areas where we think we can make the biggest impact.

As to EuroOScon, we didn't find enough people excited enough for it to make the cut (limited resources again), so we replaced it with RailsConf Europe in our conference lineup. Meanwhile, we're taking the Web 2.0 Expo, another huge success, to both Japan and Berlin next year.

  Jason Prickard [05.24.07 04:22 PM]

@Tim OK. I'll admit that I tend to dismiss MAKE, because I don't get this idea of re-interpretation of computing and knitting :-)

To me, it's just a Dale's dream, much like Wozniak's rock festivals. But hey, it's your money!

What I'd like to see from ORM on the book publishing side would be books on truly obscure stuff like Sather, Eiffel, Erlang, Scheme, Lua, OCaml, Second Life scripting, etc. We've been waiting for those books from O'Reilly for years.

That would prove that ORM is still willing to take risks. If you get too obsessed with your book market reporting tools you will not be creating new markets like you did with Perl books, but following the herd. ORA had never been the one to chase the pack. On the contrary, ORA had built its reputation on being brave and publishing books on vi, termcap, or bison. ORM has yet to prove it can be equally bold.

"But in the end, whether our local offices publish or our US offices publish, a book has to be something bookstores will buy and that will pay for itself. That's not as easy as you seem to think." -- well, you didn't have any relationships with bookstores when you started. But you had customers--the community you stayed closely in touch with. If you stay closely in touch with the communities you want to sell your books to, you will sell your books even if the bookstores refuse to carry your books. And you will create new markets that will grow and buy more of your books, conferences, training, etc.

You used to brag about the number of copies of the vi book you sold showing other publishers that there are profits in sticking with the underdogs, now you watch the market and try to hedge your bets. That's the old way of playing the publishing game that you had been known to challenge.

Don't get me wrong. I admire your work and vision, and I wouldn't be typing this if it wasn't for ORA. I just want you to continue that way.

  Henning Panke [07.31.07 01:15 AM]

What is the meaning of "C/C"? Is this another writing for C++? Because I can't see cplusplus anywhere...
