Paul Graham recently wrote a couple of thought-provoking essays on the social and economic conditions that are most conducive to the formation of startup companies: “How to Be Silicon Valley” and “Why Startups Condense in America.”
Paul had sent one version of these essays to me for comment before publication, but unfortunately, I wasn’t able to send in a response in time to be useful. However, Paul also sent me a followup question in email, for which our research group just happens to have some good answers. Paul wrote: “In your opinion, which US cities are the biggest startup centers after Silicon Valley and Boston? Fred Wilson is trying to convince me NYC is no 3, and I just can’t believe it. Surely Austin and Seattle have more startups, no?”
Paul is right, according to Roger Magoulas, the head of O’Reilly Research. The San Francisco Bay Area (including Silicon Valley) is #1, followed by Boston at #2, Seattle/Tacoma at #3, and New York at #4. Paul’s wrong about Austin, though. It’s at #7, behind Washington D.C. and San Diego. Portland, Chicago, and Los Angeles round out the top ten. Here’s Roger’s graph showing the relative concentration of tech startups, based on online job listings.
As noted in previous blog posts, the O’Reilly Research data mart includes over fifty-one million job postings acquired through a data sharing agreement with SimplyHired. (See MySQL and the Database Job Market for another example of analysis based on this data.) Roger and his chief analyst, Ben Lorica, took a look at online job postings from January to May 2006. We used a de-duped subset of approximately 2.1 million job listings with reliable geo data for this study.
Roger describes the methodology as follows: “Start-ups were identified using regular expressions to match variants of ‘start up’ in the job posts. The data includes a small percentage of job postings that are not start-ups, [but] key phrase analysis showed less than .4% of start-up jobs were likely misidentified. We assume errors are randomly distributed in the data. Manual review of random sample passed the ‘smell test’, i.e., the job postings were for self-defined start-ups, based on job posting content and web site lookup. The methodology will miss start-ups that don’t self-identify themselves as start-ups in their job postings.”
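To make the matching step concrete, here is a minimal sketch of what a “variants of ‘start up’” regular expression could look like. The pattern, the function name, and the sample postings are all illustrative assumptions; Roger did not publish the actual expressions his team used.

```python
import re

# Hypothetical pattern (not O'Reilly Research's actual regex): matches
# "startup", "start-up", and "start up", in any case, with an optional plural.
STARTUP_RE = re.compile(r"\bstart[\s-]?ups?\b", re.IGNORECASE)


def is_startup_posting(text: str) -> bool:
    """Return True if the posting text mentions a variant of 'start up'."""
    return STARTUP_RE.search(text) is not None


# Made-up sample postings, for illustration only.
postings = [
    "Well-funded start-up seeks senior Perl developer",
    "Join our startup as employee #12",
    "Early-stage Start Up hiring a DBA",
    "Regional bank seeks teller",
    "You will start up the servers each morning",
]

flags = [is_startup_posting(p) for p in postings]
```

Note that the last posting matches even though it has nothing to do with a new company. That is the kind of misidentification Roger's key phrase analysis and manual review were meant to measure.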
Roger noted a couple of caveats:
- “Companies posting jobs on-line are past the ‘hire all your friends’ stage; our analysis covers [only] these more established and better funded start-ups.
- Some large metro areas appear underrepresented in the geotagged jobs data. For example, Craigslist, one of the more reliable sources of geotagged job data, has primary sites in the following cities: San Francisco Bay Area, New York City, Boston, Seattle, San Diego, Portland, Chicago, Denver, Los Angeles, Washington DC. Dallas and Houston, the #5 and #7 metro areas in the US, show few jobs on Craigslist.
- We performed two roll-ups, one exclusively for technical positions, the other excluding roles not deemed start-up appropriate, e.g., government, education, non-profit, retail. The exclusively technical jobs subset excludes 12.5% of all start-ups in the study. The inappropriate job role subset excludes 9% of all start-ups in the study.”
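The two roll-ups in the last caveat amount to counting postings per metro area under two different filters. A rough sketch, assuming each posting has already been tagged with a role category; the category names, the `rollup` helper, and the sample rows are hypothetical and are not taken from the study:

```python
from collections import defaultdict

# Hypothetical role categories; the study's real taxonomy was not published.
TECHNICAL = {"engineering", "it"}
NOT_STARTUP_APPROPRIATE = {"government", "education", "non-profit", "retail"}

# Made-up (company, metro, role_category) rows, for illustration only.
postings = [
    ("Acme", "San Francisco", "engineering"),
    ("Acme", "San Francisco", "sales"),
    ("Beta", "Boston", "it"),
    ("Gamma", "Seattle", "retail"),
    ("Delta", "Boston", "education"),
]


def rollup(rows, keep):
    """Count postings per metro area, keeping only roles accepted by `keep`."""
    counts = defaultdict(int)
    for _company, metro, role in rows:
        if keep(role):
            counts[metro] += 1
    return dict(counts)


# Roll-up 1: exclusively technical positions.
technical_only = rollup(postings, lambda role: role in TECHNICAL)

# Roll-up 2: every role except those deemed not start-up appropriate.
startup_roles = rollup(postings, lambda role: role not in NOT_STARTUP_APPROPRIATE)
```

In this toy data, the Seattle company drops out of both roll-ups because its only posting is a retail role, which mirrors how each filter excludes some start-ups from the study entirely.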
Roger concluded: “One can quibble with aspects of our methodology, but I believe it fairly represents start-up hiring activity by region and that our methodology doesn’t introduce any biases not already inherent in the underlying data. Ranks and scale are mostly reliable, i.e., San Francisco and Boston have more start-up job hiring activity than Seattle or New York City; Denver and Boulder results are close enough to Austin that they are equivalent.”
Here’s the graph for the second roll-up, which includes non-technical jobs:
To me, this graph suggests a fourth caveat. It seems to me unlikely that San Francisco has a higher rate of small business creation overall than New York, so my guess is that there is significant bias in the non-technical jobs study due to language. Far fewer companies outside of tech may refer to a new business as a “startup.” (That is, no one refers to a “start up dry cleaning business.”)
Finally, Roger and Ben also provided a graph showing what share of all job postings in each metro area came from start-ups. Roger wrote: “Now we see that San Francisco, Boston, Austin and Seattle have the highest share of start-up on-line job postings.” Based on share of job postings, Austin takes the #3 spot, so it looks like Paul can win his bet with Fred using either Seattle or Austin after all! Here’s that final graph: