Amazon has added text concordances and statistics to all the books it has full text for. This is very cool! Not only do they have the statistically improbable phrases (SIPs), which work well as tags for the book, but they also have reading level, most common words, and even words-per-dollar and words-per-pound.
It’s great to see the wonks at Amazon going nuts with their data. I’m amazed that Google, with its vaunted fondness for CS degrees, didn’t do it first with the Google Print corpus. Either way, we’re still in the early days: the information is very much a “by the way” side page, not a primary navigation tool in the way that “best reviewed”, “bestselling”, and so on are sort options when viewing book search results.
In the near future I can easily imagine the “tags” (aka statistically improbable phrases) feeding into the category hierarchy and search results. If I search for “human neocortex”, I should get On Intelligence as a result because that book has “human neocortex” as a SIP.
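To make that idea concrete, here is a rough sketch of how SIPs could act as search tags: an inverted index mapping each phrase to the books that carry it. This is only an illustration, not how Amazon does anything. The function names `build_sip_index` and `search` are made up for this post, and apart from “human neocortex” for On Intelligence, the book and phrases in the sample data are placeholders.

```python
from collections import defaultdict


def build_sip_index(books):
    """Map each lowercased SIP to the set of book titles tagged with it."""
    index = defaultdict(set)
    for title, sips in books.items():
        for sip in sips:
            index[sip.lower()].add(title)
    return index


def search(index, query):
    """Return the titles whose SIPs exactly match the query, sorted by title."""
    return sorted(index.get(query.strip().lower(), set()))


# Sample data: only the "On Intelligence" entry comes from the post;
# the second book and its phrases are hypothetical placeholders.
books = {
    "On Intelligence": ["human neocortex"],
    "Hypothetical Book": ["placeholder phrase", "another placeholder"],
}

index = build_sip_index(books)
results = search(index, "Human Neocortex")
print(results)
```

A real version would want partial matching and some ranking, but exact phrase lookup is enough to show SIPs behaving like tags.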