2010 State of the Computer Book Market, Post 4 – The Languages

In this fourth post (posts one, two and three are found here) on the State of the Computer Book Market, we will look at programming languages and drill in a little on each language area.

Overall, the market for programming languages was down 4.33% in 2010 compared with 2009. There were 1,437,201 units sold in 2009 versus 1,374,922 units sold in 2010, a decrease of 62,279 units. Java experienced the biggest gain, at 28,633 more units in 2010 than in 2009, while PHP sat at the opposite end with the biggest decrease, at 38,614 fewer units year-over-year.
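
As a quick check on the arithmetic, here is a minimal Python sketch that recomputes the year-over-year change from the two unit totals quoted above. The function name `pct_change` is just an illustrative helper, not part of any Bookscan tooling.

```python
# Unit totals quoted in the post for all programming-language books.
units_2009 = 1_437_201
units_2010 = 1_374_922


def pct_change(old: int, new: int) -> float:
    """Year-over-year change, expressed as a percentage of the earlier year."""
    return (new - old) / old * 100


diff = units_2010 - units_2009
print(f"Unit change:    {diff:,}")
print(f"Percent change: {pct_change(units_2009, units_2010):.2f}%")
```

Dividing by the earlier year's total (2009) is the usual convention for a year-over-year figure. Dividing by the later year instead gives a slightly different percentage.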

Before we begin to drill in on the languages, we thought it would be best to explain our “language dimension.” When we group books by their language dimension, we categorize them by the language used in their code examples. So Flash Programming with Java would be in our Flash atomic category, but the language dimension would be Java. Similarly, our Head First Design Patterns book contains examples written in Java, so it too carries the “java” tag on the language dimension.
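
To make the idea concrete, here is a small Python sketch of how a title might carry both an atomic category and a language dimension. The `Book` class, its field names, and the "design patterns" category label are assumptions made for illustration. Only the two titles and their language tags come from the description above.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    category: str  # atomic category: what the book is about
    language: str  # language dimension: language used in the code examples


books = [
    Book("Flash Programming with Java", category="flash", language="java"),
    # The category label below is a guess; the post only states the language tag.
    Book("Head First Design Patterns", category="design patterns", language="java"),
]

# Group titles by language dimension, ignoring the atomic category.
by_language = defaultdict(list)
for book in books:
    by_language[book.language].append(book.title)

for language, titles in by_language.items():
    print(language, "->", titles)
```

Both books land under "java" even though neither is filed in a Java category, which is the distinction this post relies on.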

To provide some perspective, 2009 and 2010 have been the worst two years for book sales in the programming languages category. The chart directly below does not include books that are method-oriented, about project management, or about consumer operating systems, nor books without language-oriented material. So this is a different view of the market than the overall view found in Post 1 of this series. The chart shows all languages on a week-by-week basis, and the lines for 2009 and 2010 sit consistently below those of prior years.

AllYearsLanguages.jpg

In 2008, we reported that C# surpassed Java as the number one language. But Java proved to be resilient in 2009, experienced a resurgence in 2010, and is now the number one language from a book sales perspective. As you can see in the 2010 Top 20 languages chart below, Java has a significant lead in the language race, with Objective-C moving into third place, close behind C#.

2010 Market Share

Top20_langs_2010.jpg

If you look at the chart below, you will see which languages were responsible for the most units sold between 2004 and 2010. Newer languages, or “fad” languages, may not be as well represented because they have had less time to accumulate units in our data set. The chart is simply the sum of units for each language during this time period. The top ten languages generated unit sales of 7,655,365 for the 7-year period, while the second ten generated 1,919,691 in the same period. The top ten languages represented roughly 80% of units sold during this period. Looking at the 7-year trend for the languages, you can see that C# had been steadily growing until 2009 while Java had been going in the opposite direction during the same period. In addition to Java, the languages VBA, VBScript, SAS, JavaScript, C++ and C showed growth from 2009 to 2010. The other 13 languages showed declines when comparing 2009 to 2010.
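
The "roughly 80%" figure can be reproduced from the two totals above. This short Python sketch assumes the share is measured against the top 20 languages only, since those are the only totals given here.

```python
# Seven-year unit totals (2004 to 2010) quoted above.
top_ten_units = 7_655_365
second_ten_units = 1_919_691

# Assumption: the denominator is the top 20 languages, not the whole market.
top_twenty_units = top_ten_units + second_ten_units
top_ten_share = top_ten_units / top_twenty_units * 100

print(f"Top 20 total:  {top_twenty_units:,}")
print(f"Top ten share: {top_ten_share:.2f}%")
```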

AllYearsT20Langs.jpg

A Treemap View of the Programming Languages

prog_lang_tree.jpg

In the treemap view above, which compares the last quarter of 2010 with the last quarter of 2009, you’ll notice a lot of bright green areas, several solid green areas and a fair share of black and red areas. The main reason Objective-C is down 12% is that it had a tremendous 2009, which was hard to sustain. The language went from a small speck on this treemap to occupying a fairly sizable square.

Before we dive in, let’s look at the high-level picture for the grouping of languages. I have grouped these languages by total number of units sold between 2004 and 2010. As you can see in the table below, only the Mid-Major group experienced growth in 2010, while the rest showed declines. The language driving the most growth in the Mid-Major area was R. An interesting observation is that the statistical languages, much like those you would be exposed to at our Strata Conference, are experiencing substantial growth. Namely, R, SAS, MATLAB, LabVIEW, Mathematica, and SPSS have collectively seen an increase of 49,504 units, or a whopping 102.87% growth. Maybe Hal Varian’s quip about statistics being the “sexy job of the future” is motivating developers to learn these languages.

| Group | Unit Range | 2010 Units | 2009 Units | 2010 Titles | 2009 Titles | 2010 Mkt Share | 2009 Mkt Share |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Large | 50,000 to 200,000 | 1,051,945 | 1,069,762 | 1,590 | 1,433 | 75.96% | 75.00% |
| Major | 10,000 to 49,000 | 227,306 | 254,587 | 450 | 456 | 16.41% | 17.85% |
| Mid-Major | 3,000 to 9,999 | 53,152 | 44,909 | 104 | 85 | 3.84% | 3.15% |
| Mid-Minor | 1,682 to 2,999 | 20,818 | 20,965 | 61 | 58 | 1.50% | 1.47% |
| Minor | 1,000 to 1,680 | 13,000 | 15,517 | 46 | 31 | 0.94% | 1.09% |
| Linelist | 399 to 999 | 6,299 | 6,350 | 25 | 19 | 0.45% | 0.45% |
| TheRest | < 399 | 3,370 | 6,368 | 49 | 43 | 0.24% | 0.45% |
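
The grouping can be expressed as a simple lookup on unit ranges. The sketch below is only an illustration: `GROUP_FLOORS` and `group_for` are made-up names, the thresholds are the lower bounds from the Unit Range column, and it classifies on 2010 units the way the section headings further down do. The sample unit counts are taken from the per-language tables later in this post.

```python
# Lower bound of each group's unit range, checked from largest to smallest.
GROUP_FLOORS = [
    ("Large", 50_000),
    ("Major", 10_000),
    ("Mid-Major", 3_000),
    ("Mid-Minor", 1_682),
    ("Minor", 1_000),
    ("Linelist", 399),
]


def group_for(units: int) -> str:
    """Return the first group whose lower bound the unit count reaches."""
    for name, floor in GROUP_FLOORS:
        if units >= floor:
            return name
    return "TheRest"


samples = [("Java", 194_520), ("Ruby", 20_004), ("R", 7_800),
           ("F#", 2_905), ("Erlang", 1_513), ("Tcl", 965)]
for language, units in samples:
    print(f"{language:<8}{units:>9,}  {group_for(units)}")
```

Using only the lower bounds sidesteps the small gaps between the published ranges (for example, between 1,680 and 1,682).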

To present this information in a more readable format, each of the tables below is labeled with its group name (for example, *Large*) and uses the same seven columns, arranged under Units, Titles, and Market Share:

  1. Language: name or short name of the language
  2. 2010 Units: units sold in 2010
  3. 2009 Units: units sold in 2009
  4. 2010 Titles: number of titles making the Bookscan 3000 in 2010
  5. 2009 Titles: number of titles making the Bookscan 3000 in 2009
  6. 2010 Mkt Share: 2010 market share
  7. 2009 Mkt Share: 2009 market share

The following table contains data for the Large languages. As you can see, 4 of the top 10 languages experienced growth in 2010, led by Java’s impressive turnaround. As you may remember from previous posts, Java was on a steady decline in units sold until 2009, and its recovery continued through 2010. Could Android development be fueling this Java resurgence? Even though Objective-C experienced a decline in 2010 compared to 2009, it is amazing that it made the top ten. Previous rankings had the language near the 20th spot. JavaScript continues its steady growth as it solidifies its spot as the most used and most important language for web programming.

Large Programming Languages: 50,000 to 195,000 units in 2010

| Language | 2010 Units | 2009 Units | 2010 Titles | 2009 Titles | 2010 Mkt Share | 2009 Mkt Share |
| --- | --- | --- | --- | --- | --- | --- |
| Java | 194,520 | 165,887 | 361 | 332 | 13.90% | 11.54% |
| C# | 153,469 | 156,043 | 263 | 230 | 10.97% | 10.86% |
| Objective-C | 136,711 | 141,608 | 89 | 51 | 9.77% | 9.85% |
| JavaScript | 131,850 | 115,107 | 169 | 157 | 9.42% | 8.01% |
| PHP | 106,952 | 145,566 | 163 | 152 | 7.64% | 10.13% |
| C/C++ | 94,268 | 93,067 | 192 | 184 | 6.74% | 6.48% |
| VBA | 61,108 | 48,507 | 68 | 58 | 4.37% | 3.38% |
| ActionScript | 60,578 | 83,017 | 96 | 85 | 4.33% | 5.78% |
| Python | 58,905 | 60,700 | 94 | 84 | 4.21% | 4.22% |
| SQL | 53,584 | 60,260 | 95 | 100 | 3.83% | 4.19% |
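
For readers who want to see the winners and losers at a glance, the following Python sketch computes the unit change and percentage change for each Large language from the table above. The data is copied from the table; the variable names are arbitrary.

```python
# (2010 units, 2009 units) for each Large language, copied from the table above.
large = {
    "Java": (194_520, 165_887),
    "C#": (153_469, 156_043),
    "Objective-C": (136_711, 141_608),
    "JavaScript": (131_850, 115_107),
    "PHP": (106_952, 145_566),
    "C/C++": (94_268, 93_067),
    "VBA": (61_108, 48_507),
    "ActionScript": (60_578, 83_017),
    "Python": (58_905, 60_700),
    "SQL": (53_584, 60_260),
}

# Unit change per language, and the languages that grew year-over-year.
changes = {lang: u2010 - u2009 for lang, (u2010, u2009) in large.items()}
growers = [lang for lang, delta in changes.items() if delta > 0]

# Print languages from biggest gain to biggest loss.
for lang, delta in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    u2009 = large[lang][1]
    print(f"{lang:<13}{delta:>+9,}{delta / u2009:>+9.1%}")

print("Grew in 2010:", ", ".join(growers))
```

Sorting by unit change rather than by percentage keeps the focus on volume, which is how the biggest gain (Java) and biggest decrease (PHP) are described at the top of this post.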

Here are the top titles for the Large languages. Incidentally, the titles and order are the same whether you look at units sold or dollars generated, except that the WordPress title falls out of the top five and Addison-Wesley’s PHP and MySQL Web Development moves to #5:

| Publisher | Title |
| --- | --- |
| O’Reilly | Learning PHP, MySQL, and JavaScript, First Edition |
| O’Reilly | Head First Java, Second Edition |
| Wrox | Professional Android 2 Application Development |
| Addison-Wesley | Programming in Objective-C 2.0 |
| Dummies | WordPress for Dummies (covers PHP) |

You’ll notice that among the Major languages, C, PowerShell, Shell Script, and VBScript all had growth. Overall, the Major languages sold roughly 27,000 fewer units in 2010 than in 2009. That equates to a decrease of about 11% for the Major languages.

Major Programming Languages: 10,000 to 49,999 units in 2010

| Language | 2010 Units | 2009 Units | 2010 Titles | 2009 Titles | 2010 Mkt Share | 2009 Mkt Share |
| --- | --- | --- | --- | --- | --- | --- |
| .NET Languages | 44,958 | 57,286 | 82 | 78 | 3.25% | 4.02% |
| Visual Basic | 42,225 | 55,574 | 88 | 94 | 3.05% | 3.90% |
| C | 36,638 | 34,820 | 91 | 83 | 2.65% | 2.44% |
| Ruby | 20,004 | 29,977 | 48 | 63 | 1.44% | 2.10% |
| PowerShell | 18,652 | 12,124 | 26 | 19 | 1.35% | 0.85% |
| Transact SQL | 17,507 | 17,601 | 28 | 29 | 1.26% | 1.23% |
| Perl | 15,606 | 20,030 | 32 | 34 | 1.13% | 1.40% |
| PL/SQL | 10,670 | 10,974 | 24 | 26 | 0.77% | 0.77% |
| Shell Script | 10,720 | 7,482 | 20 | 17 | 0.77% | 0.52% |
| VBScript | 10,326 | 8,719 | 11 | 13 | 0.74% | 0.61% |

Here are the top titles for the Major languages.

| Publisher | Title |
| --- | --- |
| Prentice Hall | C Programming Language |
| Prentice Hall | Practical Guide to Linux Commands, Editors, and Shell Programming |
| O’Reilly | Learning Perl, 5th Edition |
| Morgan Kaufmann | Programming Massively Parallel Processors: A Hands-on Approach (C language) |
| Pragmatic | Agile Web Development with Rails, Third Edition |

Mid-Major Programming Languages: 3,000 to 9,999 units in 2010

The news in this category is that the statistical languages are doing really well. As noted above, these languages have grown by 102.87% from 2009 to 2010. The most impressive growth is for the eight titles for the R language: the overall category is led by R in a Nutshell.

| Language | 2010 Units | 2009 Units | 2010 Titles | 2009 Titles | 2010 Mkt Share | 2009 Mkt Share |
| --- | --- | --- | --- | --- | --- | --- |
| SAS | 9,035 | 7,974 | 27 | 21 | 0.65% | 0.56% |
| SPSS | 8,973 | 6,818 | 16 | 10 | 0.65% | 0.48% |
| MATLAB | 7,857 | 6,752 | 22 | 17 | 0.57% | 0.47% |
| R | 7,800 | 2,817 | 15 | 12 | 0.56% | 0.20% |
| Processing | 6,996 | 6,038 | 8 | 6 | 0.51% | 0.42% |
| Shell Script | 6,073 | 7,116 | 19 | 16 | 0.44% | 0.50% |
| Basic | 5,540 | 5,277 | 7 | 9 | 0.40% | 0.37% |
| Lua | 4,677 | 5,570 | 7 | 6 | 0.34% | 0.39% |
| Assembly | 4,391 | 4,359 | 18 | 14 | 0.32% | 0.31% |
| MDX | 3,890 | 4,838 | 8 | 8 | 0.28% | 0.34% |
| UnrealScript | 3,028 | 2,440 | 3 | 3 | 0.22% | 0.17% |

Here are the top titles for the Mid-Major languages.

| Publisher | Title |
| --- | --- |
| O’Reilly | R in a Nutshell: A Desktop Quick Reference |
| Prentice Hall | Using SPSS for Windows and Macintosh: Analyzing and Understanding Data |
| SAS Press | The Little SAS Book: A Primer, Fourth Edition |
| Open University Press | SPSS Survival Manual: A Step by Step Guide to Data Analysis Using SPSS for Windows |
| Sams | Mastering Unreal Technology, Volume I: Introduction to Level Design with Unreal Engine 3 |

Mid-Minor: 1,682 to 2,999 units in 2010

The news in this category is the growth of functional languages such as F#, Scala, and Lisp. These languages generated 7,648 units in 2010, compared to 3,718 units in 2009, which works out to roughly 106% year-over-year growth.

| Language | 2010 Units | 2009 Units | 2010 Titles | 2009 Titles | 2010 Mkt Share | 2009 Mkt Share |
| --- | --- | --- | --- | --- | --- | --- |
| F# | 2,905 | 1,095 | 6 | 5 | 0.21% | 0.08% |
| Scala | 2,531 | 3,946 | 5 | 5 | 0.18% | 0.28% |
| Groovy | 2,452 | 3,972 | 7 | 8 | 0.18% | 0.28% |
| Alice | 2,441 | 2,472 | 10 | 9 | 0.18% | 0.17% |
| BlitzMax | 1,836 | 2,603 | 2 | 2 | 0.13% | 0.18% |
| AppleScript | 1,787 | 3,994 | 4 | 6 | 0.13% | 0.28% |
| VHDL | 1,785 | 1,733 | 18 | 15 | 0.13% | 0.12% |
| Bash | 1,715 | 183 | 2 | 1 | 0.12% | 0.01% |
| Lisp | 1,684 | 309 | 4 | 6 | 0.12% | 0.02% |
| LabVIEW | 1,682 | 658 | 3 | 1 | 0.12% | 0.05% |

Here are the top titles for the Mid-Minor languages.

| Publisher | Title |
| --- | --- |
| Prentice Hall | Learning To Program with Alice |
| Artima | Programming in Scala: A Comprehensive Step-by-step Guide |
| No Starch Press | Land of Lisp: Learn to Program in Lisp, One Game at a Time! |
| Prentice Hall | LabVIEW 2009 Student Edition |
| Manning | Real World Functional Programming: With Examples in F# and C# |

Minor Languages: 1,000 to 1,680 units in 2010

This category saw 6 of the 10 languages sell fewer units in 2010, and units sold fell roughly 16% year-over-year. The bright spot was the performance of Mathematica, mostly fueled by the Mathematica Cookbook. Like the previous category, this area is dominated by functional languages; however, these languages are not experiencing the same substantial growth.

| Language | 2010 Units | 2009 Units | 2010 Titles | 2009 Titles | 2010 Mkt Share | 2009 Mkt Share |
| --- | --- | --- | --- | --- | --- | --- |
| Mathematica | 1,675 | 900 | 9 | 4 | 0.12% | 0.06% |
| Erlang | 1,513 | 2,276 | 3 | 2 | 0.11% | 0.16% |
| Scheme | 1,479 | 1,364 | 8 | 7 | 0.11% | 0.10% |
| FBML | 1,367 | 2,335 | 5 | 4 | 0.10% | 0.16% |
| Clojure | 1,332 | 1,460 | 2 | 1 | 0.10% | 0.10% |
| AWK | 1,200 | 1,642 | 2 | 2 | 0.09% | 0.12% |
| NXT-G | 1,172 | 969 | 4 | 1 | 0.08% | 0.07% |
| Scratch | 1,112 | 674 | 2 | 2 | 0.08% | 0.05% |
| LaTeX | 1,099 | 1,623 | 6 | 5 | 0.08% | 0.11% |
| Haskell | 1,051 | 2,274 | 5 | 3 | 0.08% | 0.16% |

Here are the top titles for the Minor languages.

| Publisher | Title |
| --- | --- |
| O’Reilly | Mathematica Cookbook |
| O’Reilly | Erlang Programming |
| O’Reilly | Real World Haskell |
| Pragmatic | Programming Clojure |
| O’Reilly | sed & awk |

Linelist: 399 to 999 units in 2010

This category saw 6 of the 10 languages sell more units in 2010, although the sales volume is fairly insignificant. Units sold decreased by roughly 0.8% year-over-year. I am not going to list the bestsellers, because they are not exactly bestsellers in this sort of category.

| Language | 2010 Units | 2009 Units | 2010 Titles | 2009 Titles | 2010 Mkt Share | 2009 Mkt Share |
| --- | --- | --- | --- | --- | --- | --- |
| Tcl | 965 | 856 | 3 | 4 | 0.07% | 0.06% |
| Stata | 818 | 954 | 6 | 4 | 0.06% | 0.07% |
| PeopleCode | 702 | 444 | 2 | 1 | 0.05% | 0.03% |
| HLA | 625 | 0 | 1 | 0 | 0.05% | 0.00% |
| Linden Script | 623 | 1,695 | 4 | 3 | 0.04% | 0.12% |
| D | 604 | 0 | 1 | 0 | 0.04% | 0.00% |
| MEL | 587 | 1,022 | 5 | 4 | 0.04% | 0.07% |
| KML | 531 | 973 | 1 | 1 | 0.04% | 0.07% |
| OpenGL Shader | 445 | 406 | 1 | 2 | 0.03% | 0.03% |
| Spin | 399 | 0 | 1 | 0 | 0.03% | 0.00% |

TheRest Programming Languages: fewer than 400 units in 2010

Lastly, the following languages sold fewer than 400 units in 2010. Here is the list in descending order: autolisp, unity, x++, cfml, inform, mysql spl, blitz3d, q, nxt, gml, pure data, javafx, rpg, cobol, nxc, minitab, ml, boo, ada, fortran, octave, jcl, racket, jsl, idl, cfscript, abap, verilog, m, smalltalk, mumps, go, windows script, egl, c/al, realbasic, bondi, cl, cs2, eiffel, ocaml, and xquery.

Next up, Post 5 will look at digital sales.

State of the Computer Book Market 2011 — All five parts of Mike Hendrickson’s computer book market report are collected in this free ebook. Available in EPUB, Mobi and PDF formats.
  • MySchizoBuddy

    So Ruby didn’t take over the world as was popularly predicted. what slowed it down?

  • http://about.me/lusis John E. Vincent

    It’s not that Ruby didn’t take over the world. It’s that O’Reilly simply isn’t the destination for people when looking for Ruby books (which is a shame).

    The Ruby book market is almost entirely dominated by PragProg. I still buy O’Reilly books but when I need something Ruby related, I head to PragProg.

  • http://webyog.com Peter Laursen

    The increase in the Powershell category indicates to me that finally people start understanding that Windows has (since Vista) a scripting environment equally strong as what Unix/Linux has (shell, perl).

  • Mike Hendrickson

    John E. Vincent, this is not about O’Reilly being a Ruby destination or not, but reflects the whole market, all publishers including the Prags. Sorry I did not make it explicit in this post, but the other Posts did indicate the data is for all publishers, NOT just O’Reilly.

  • Ben Tilly

    Is “units sold” physical books, or does this count electronic downloads? I’m specifically curious with how you count something like Safari which allows people to read what they need from many books without actually buying them.

  • Mike Hendrickson

    @Ben Tilly yes, this is all about physical books sold in retail outlets and online retailers. Post 5 will have a bit of information about digital distribution. And as you indicate, Safari is a big part of digital books. It is O’Reilly’s second largest channel of distribution, but it is NOT included in this analysis.

  • http://overseas-exile.blogspot.com/ Ovid

    I’d also be curious to see the number of new titles per language. If a language is doing poorly but has new titles or is doing well but has few new titles, that would be interesting information.

  • phil

    It seems like books are playing a decreasing role. When I need to learn a new language, library or framework I’m increasingly turning to online resources: blogs, github, etc. It takes quite a bit of time to write, edit and publish a book. In my case I’ve been working with node.js. There are no books available for node.js – it’s only been around for what, 18 months? And it’s been something of a moving target such that it would be difficult to write a book about it – the book would be out of date by the time it hit the shelves. So I’ve pretty much learned node.js by online resources. If a book did come out at this point, I probably wouldn’t need it anymore.

  • http://amzn.to/d1Ci8A Matthew A. Russell

    To try and shed some additional light on the large decrease in the number of PHP units sold from the treemap (-37%), an interesting data point to consider is that the TIOBE Index (an indicator of the popularity of programming languages) very recently showed Python gaining serious ground on PHP, as of Feb 2011. However, there’s still a need to explain the overall decrease in Python.

    My best guess as to why Python still shows a decrease of -9% in terms of units sold in spite of the TIOBE Index growth indicators could be because the online documentation for Python is extremely well done, and the language is easy enough to pick up that most people tend to use it versus purchasing books (at least in the circles I travel.) That said, I wouldn’t be the right person to comment on the quality of the online PHP documentation, and I suspect it’s quite good as well although the language is arguably a little less accessible/readable than Python.

  • Simon Hibbs

    I’m glad Python seems to have become the dominant general purpose scripting language as I switched to it from Perl in around 2003. Still, the decline in Ruby and Perl is alarming and I hope they at least stabilise at or above their current levels soon. They both have good communities and along with Lua they provide valuable choice in the scripting language space.

  • http://toddsnotes.blogspot.com/ todd kaufmann

    Is the raw data available?
    I think all the graphs are great,
    but wonder what insights others might come up with.

    You could have a contest…

  • Mike Hendrickson

    @Todd Kaufmann Unfortunately we cannot provide any raw data. Bookscan is gracious enough to let us help inform the market about what is happening, but I’d rather not push it further. But there are other measures; have you seen TIOBE? I would imagine you could use GNIP to scrape Facebook, Twitter and LinkedIn to see which languages are mentioned most too… that would be interesting.

    @Simon Hibbs general purpose? Powershell had the best growth for a scripting language. It surpassed Perl, and is closing in on Ruby…it could be because of our awesome PowerShell Cookbook, 2/e

  • Anonymous

    Ruby doesn’t need books to get developers going. You are only getting the late adopters in this thing, and Ruby doesn’t need those

  • Simon Hibbs

    @Mike Hendrickson: The term general purpose is a controversial one when it comes to languages. However for me it means that the language is a credible contender for the development of complex applications in the web, GUI, database and system worlds.

    I know Powershell can throw up dialog boxes and reach into databases. I’m sure it can handle HTTP requests too. So can VBScript or bash. That doesn’t make them general purpose languages that anyone would seriously consider for the development of a complex GUI, database or web site. Powershell is much more modern than them, so maybe I’m wrong to pigeonhole Powershell in this way. E.g., Stack Overflow is written mostly in C# (I believe), but would you consider developing a site like that in Powershell?

  • Gary

    “…because the online documentation for Python is extremely well done.”
    Agreed.

    I remember my Java programming days, I had books for this and books for that. Great news for Amazon and O’Reilly, but bad news for me.
    The reason I bought all those books:
    o API that changed constantly
    o Seemed to be a (deliberate?) difficult to navigate website from Sun, which made having the latest book all that more essential.

    Some companies believe that winning hearts and minds of programmers requires a huge body of published paper/electronic books and use subscription based websites to tip that balance towards print.

    Other more enlightened companies see the value in long term relationships with the programmer and work hard on subscription free websites, allowing the publication market to manage itself naturally.

    I know which I prefer to program with :)

  • http://disweb.dis.unimelb.edu.au/staff/stevenbg/ Steve Goschnick

    >Similarly, our Head First Design Patterns book contains examples
    >written in Java, so it too carries the “java” tag

    … this being your stated method for tallying up Java books, if also applied to the SQL language, would have raised SQL to over 80,000 units in 2010, which would then place it at 7th in your ‘Large’ language group ( i.e. pl/SQL and Transact SQL, both have heavy lashings of/are founded upon the SQL language)

  • Florian

    Mike, thank you for providing all this insight.

    A question regarding your numbers: In Post 2, category family “Sys & Prog”, the units sold for the eight biggest publishers sum up to 1,828,478, which–according to your graph–is a market share of 92%. So the total number of units sold in this category family in 2010 would be close to 2 million. However, in Post 4 you say the total number of units sold for programming languages (which is only one part of the category family) would be close to 6 million. What is the reason for this discrepancy?

  • http://geekswithblogs.net/Designingcode Keith Nicholas

    Like others have suggested..

    I know from my point of view, I hardly need books on languages anymore. The online world often more than meets the need.

    I often only need books for something more obscure and less popular, like a book on Antlr (does that count as a java sale when I bought it for doing C#? :) )

  • Mike Hendrickson

    @Steve Goschnick We assign languages to help us understand what technologies are more popular to book buyers. In the case of the SQL variants we are at times interested in how the major SQL variants are faring against each other. You are correct to point out that grouping the SQL variants together makes sense and would make SQL the 7th ranked language based on book sales. [This also jibes with our understanding of the continuing popularity of SQL compared to other programming languages].

    I will try to post a new chart aggregating all variants of SQL together so you can see their collective trend.
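
The aggregation discussed in this exchange can be sketched in a few lines of Python. The unit counts are the 2010 figures from the Large and Major tables in the post. Treating SQL, Transact SQL and PL/SQL as the full set of "SQL variants" is an assumption made for this illustration, not an official grouping.

```python
# 2010 units for the Large languages, as listed in the post.
large_2010 = {
    "Java": 194_520, "C#": 153_469, "Objective-C": 136_711,
    "JavaScript": 131_850, "PHP": 106_952, "C/C++": 94_268,
    "VBA": 61_108, "ActionScript": 60_578, "Python": 58_905,
    "SQL": 53_584,
}

# Assumed set of SQL variants to fold together (2010 units).
sql_variants = {"SQL": 53_584, "Transact SQL": 17_507, "PL/SQL": 10_670}

# Replace the plain SQL entry with the combined SQL total, then re-rank.
combined = dict(large_2010)
combined["SQL"] = sum(sql_variants.values())

ranking = sorted(combined, key=combined.get, reverse=True)
sql_rank = ranking.index("SQL") + 1

print(f"Combined SQL units: {combined['SQL']:,}")
print(f"Rank among Large languages: {sql_rank}")
```

Under that assumption the combined total clears 80,000 units and lands just behind C/C++, which matches the ranking described in the two comments above.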

  • Mike Hendrickson

    @florian Thanks for catching my error. I had reported 7 years worth of data in the language number above, and have now corrected the error. 2009 was 1,437,201 and 2010 was 1,374,922 for (62,279) fewer units in 2010.