Ben Lorica
Ben Lorica is the Senior Analyst in the Market Research Group at O'Reilly Media, Inc. He has applied Business Intelligence, Data Mining and Statistical Analysis in a variety of settings including Direct Marketing, Consumer and Market Research, Targeted Advertising, Text Mining, and Financial Engineering. His background includes stints with an investment management company, internet startups, and financial services. At O'Reilly, Ben works on custom research and consulting projects, open source data warehousing, and analytics.
An ex-academic, he was an Assistant Professor at U.C. Davis and was the founding Department Chair for Statistics and Mathematics at C.S.U. Monterey Bay.
Wed
May 7
2008
Macs in the Enterprise
At every O’Reilly conference and event I've been to, the share of Mac laptops is disproportionately high: I would say at least a quarter (if not more) at most of our conferences. The most common explanation I hear is that the Mac combines an elegant UI, a suite of useful software, and a Unix command line. O'Reilly does tilt towards the “alpha geek” crowd, but one wonders if mainstream companies are beginning to allow Apple products (including iPhones) on their networks.
Business Week’s most recent cover story is on the growing interest in Apple computers among corporate users. I was expecting the article to include some estimates for the corporate market, or at least the results of a recent survey. It was after all the cover story of the U.S. edition.
I do recognize that estimating Apple’s share of the corporate market is difficult. Apple itself does not provide corporate sales estimates and, according to the article, it doesn't even have much of a sales force dedicated to the space. What Apple provides are sales for Desktops and Portable PCs:

Starting in Q3-2006, the share of portables jumped to 60% and has remained slightly above that number. Apple began moving to Intel processors in Q1-2006 and by August 2006 the entire line of Apple PCs had switched over. The graph for revenues (Desktops vs. Portables) is essentially the same. In Q2-2008, portables grew 61% compared to the prior year, and now account for close to 2 in 3 units sold.
As the author points out, Apple’s secrecy and large margins may hamper it in the corporate market, where buyers prefer transparency and bargains. Overseas, particularly in the developing world, Macs are too expensive for most. With the introduction of expensive models (e.g. MacBook Air), the article estimates that the average price for a Mac is now about $1,526: too pricey even for large American companies, unless of course Apple is willing to forgo its fat margins and negotiate. Why would Apple want that when consumers seem willing to pay for its products? With more and more tasks moving to the cloud, expensive Macs will be even harder to justify. So while more companies might be willing to allow Macs, I would be surprised if Apple makes inroads in the corporate space.
My pet peeve: MS-Excel 2008 for the Mac is quite unstable, and I think the 2004 version is superior. In the corporate market a stable and easy-to-use spreadsheet is a must.
If you work for a large company, it probably tightly limits what machines you can use. Luckily for me, O’Reilly allows the use of any (Mac, Windows, Linux, BSD) computer.
Fri
May 2
2008
Facebook App Categories Ranked By Usage
We have been tracking the usage of each individual Facebook application since the launch of their platform, so I have been following the discussion questioning the utility of the majority of applications published to date. A lot of Facebook applications are perceived as "time-wasters", but I should caution that the number of apps in a category does not translate directly into active users:

As an example, there are far fewer Dating apps than Sports apps, but Dating apps generate far more active users. Moreover, Messaging generates more active users than other "less useful" categories, and has grown the fastest over the last month:

Developers select the categories for their applications, so besides double-counting apps that are assigned multiple categories, inconsistencies in how the developers assign their apps to categories affect the results. We addressed some of these issues by categorizing the top applications ourselves. For more on the Facebook Application Platform, check the most recent edition of our research report. Also, Roger Magoulas of O'Reilly Research will present some of our most recent findings at the upcoming Graphing Social Patterns conference.
Tue
Apr 29
2008
Inside Innovation at Xerox PARC
We were part of a group of journalists and bloggers invited to hear presentations from 10 different research groups within various parts of Xerox, PARC, and Fuji-Xerox. The format was similar to a science fair or a poster session in an academic conference with small groups moving around to hear presentations from the different projects. While other research labs use a large auditorium and parade different researchers in, I thought the smaller, science fair format made for better interactions between the visitors and the researchers.
We saw early prototypes created by the researchers themselves, so the user interfaces were far from polished. Here are some of the highlights from our visit:
Seamless Document Viewer

A J2ME application designed to help solve the problem of viewing documents on small screens (cell phones and other mobile devices), this app automatically segments a document into blocks and displays the keyphrase for each block. The keyphrases are intended to help users navigate to sections of interest quickly. The cell phone demo we saw used a fairly intuitive touchscreen interface that included an interesting way to pan and zoom in and out of sections of a document. Because documents viewed through the application need to be processed and analyzed in advance, it is better suited for viewing PDFs and static documents, not frequently updated web pages.
Hybrid Categorization
Categorizing documents automatically is an old topic in information science. Most tools rely only on the text portion of documents and use a combination of Natural Language Processing and Machine Learning. I was looking forward to this presentation because we use text-only automatic classifiers to help organize some of our data sources.
Hybrid categorization uses both the text and images contained in documents. It isn't clear how scalable their hybrid categorizer is: the results we saw were based on small numbers of documents. Precision measures the accuracy of a categorizer (the share of documents it assigns to a category that actually belong there), and judging from the results of an academic competition, Xerox's hybrid (text + images) approach may hold some promise.
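For readers unfamiliar with the metric, here is a minimal Python sketch of how precision is computed for a single category. The labels are made up for illustration and are not any of the results Xerox presented; the function name is my own.

```python
def precision(predicted, actual, category):
    """Fraction of documents assigned to `category` that truly belong to it."""
    # True labels of every document the categorizer placed in `category`.
    assigned = [a for p, a in zip(predicted, actual) if p == category]
    if not assigned:
        return 0.0  # nothing was assigned to this category
    correct = sum(1 for a in assigned if a == category)
    return correct / len(assigned)

# Hypothetical labels for five documents (illustrative only).
predicted = ["invoice", "invoice", "memo", "invoice", "memo"]
actual = ["invoice", "memo", "memo", "invoice", "invoice"]

print(precision(predicted, actual, "invoice"))
```

Three documents were labeled "invoice" and two of them really are invoices, so precision for that category is two out of three.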

Erasable Paper
"Reusable paper" refers to paper coated with special materials and a custom printer that shoots UV light onto it. The resulting printed document is designed to fade within 24 hours and the paper can be reused and fed into the printer multiple (10+) times. The printer can even erase the printing on the specially-coated papers, and print an entirely new document on the same sheets of paper. We raised the possibility that a sheet of paper that has nominally erased itself can be reverse engineered to reveal sensitive content: think security agencies or dumpster-diving identity thieves. Surprisingly, the researchers had not seriously investigated the possibility of "recovering erased documents".
The cost of the specially-coated paper is projected to be only 2-3 times the cost of normal paper, while the accompanying printer will cost about the same as a laser printer. Since paper can be reused multiple (10+) times, the obvious environmental benefits also lead to savings. Further savings come from the design of the printer itself: since the printing is done with light (UV LED bar), the printer does not use ink or toner.
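A rough back-of-the-envelope version of the paper savings, using only the projections above (2-3 times the cost of normal paper, 10 reuses). The price of a normal sheet is an assumed placeholder, and the sketch ignores the printer, electricity, and the toner that normal printing would need.

```python
def cost_per_print(sheet_cost, uses):
    """Paper cost of one printed page when a sheet is reused `uses` times."""
    return sheet_cost / uses

NORMAL_SHEET = 0.01  # assumed price of a normal sheet, in dollars
REUSES = 10          # lower bound quoted by the researchers

for multiplier in (2, 3):
    reusable = cost_per_print(NORMAL_SHEET * multiplier, REUSES)
    share = reusable / NORMAL_SHEET
    print(f"{multiplier}x paper, {REUSES} uses: {share:.0%} of normal paper cost per print")
```

Under these assumptions the paper cost per printed page drops to roughly 20-30% of normal paper, and the ratio does not depend on the placeholder price.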
Intelligent Redaction
Redaction is the process of removing sensitive information from documents. Popular examples include government/intelligence documents released to the public and medical records. Text redaction is normally a tedious manual process that requires staff possessing significant domain expertise. As an example, privacy rules governing medical records in the U.S. require redaction of terms associated with HIV/AIDS, mental health and drug/alcohol problems. In the demo we saw, the software tool examined a corpus of documents, automatically came up with terms/phrases associated with the listed illnesses, and redacted them from every document in the corpus.
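To make the last step concrete, here is a minimal Python sketch of redacting a known list of terms from every document in a corpus. This is not Xerox's method: the interesting part of their tool, discovering the terms automatically, is not shown, so the term list, sample documents, and function name below are all hypothetical.

```python
import re

def redact(text, terms, mask="[REDACTED]"):
    """Replace every whole-word occurrence of any term with `mask`."""
    if not terms:
        return text
    # Longest terms first so multi-word phrases win over shorter sub-terms.
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(mask, text)

# Hypothetical term list; the demo derived its terms from the corpus itself.
terms = ["HIV", "antiretroviral therapy"]
corpus = [
    "Patient started antiretroviral therapy in March.",
    "HIV test ordered.",
]
for doc in corpus:
    print(redact(doc, terms))
```

Simple term matching like this misses paraphrases and indirect references, which is presumably why the manual process requires so much domain expertise.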
Other Notables
Fri
Apr 25
2008
Virtual Worlds & the Cognitive Surplus
How much work went into producing all the (language) versions of Wikipedia? The answer: much less than the total number of hours Americans spent watching TV over the last year. Listening to Clay Shirky estimate the amount of untapped cognitive cycles in his Web 2.0 keynote reminded me of a similar calculation we did early last year. Amidst the flurry of media stories about Second Life, my immediate reaction to the hype was to compare it to the benchmark that Clay cited in his talk: hours of TV viewing per capita.

Nevertheless, the media stories combined with signals from a few of our technical indicators (online job postings, book sales, message lists, ...) encouraged us to dig deeper into Virtual Worlds. We spent time in the latter part of 2007 understanding Virtual Worlds and earlier this year we released a Business Guide. This Sunday (10 a.m. PDT), I will be speaking in Second Life as part of the Virtual Business Expo and I plan to briefly discuss some of our most recent findings. The organizers are encouraging people to register in advance and you can do so here. Hope to see you in-world this Sunday.
Wed
Apr 23
2008
RSA 2008
Bruce Schneier's post about the recent RSA conference made me realize that my reaction to walking the exhibition hall may have been the norm. I have talked to enough security vendors to know that their basic message is constant: (1) security threats are extremely serious and rapidly growing problems, (2) their innovative solutions will render most of these threats harmless. What struck me was how intensely these two points were being delivered.
Imagine walking a huge exhibition floor housing multiple solutions to just about every security problem, listening to vendors review how serious the threats are, then being told multiple times that a particular solution is the most effective way to neutralize those threats. I can see why some attendees get shellshocked and, as Bruce observed, become less likely to buy. I wasn't there as a buyer, but the overall fervor was unusual enough that I relayed it to a few friends shortly after.
Because of schedule conflicts, I walked around on the fourth day of a five-day conference, and by then some vendors were probably aiming to book sales and identify prospects. Next year I'll try to check out the exhibition hall earlier in the week.











