- Conference Organisers Handbook — accurate guide to running a two-day 300-person conference. See also Yet Another Perl Conference guidelines.
- Twitter Shifting More Code to JVM — interesting how, at scale, there are some tools and techniques of the scorned Enterprise that the web cool kids must turn to. Some. Business Process Workflow XML Schemas will never find love.
- Luis von Ahn on Duolingo — from the team that gave us “OCR books as you verify you are a human” CAPTCHAs comes “learn a new language as you translate the web”. I would love to try this; it sounds great (and is an example of what crowdsourcing can be).
- Fully Bayesian Computing (PDF) — “A fully Bayesian computing environment calls for the possibility of defining vector and array objects that may contain both random and deterministic quantities, and syntax rules that allow treating these objects much like any variables or numeric arrays. Working within the statistical package R, we introduce a new object-oriented framework based on a new random variable data type that is implicitly represented by simulations.” Perl made text processing easy because strings were first-class objects with a rich set of functions to operate on them; Node.js has a sweet HTTP library; it’s interesting to see how much more intuitive an algorithm becomes when random variables are a data type. (via BigData)
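To make the "random variable as a data type" idea concrete, here is a minimal Python sketch. It is not the paper's R framework: the `RandomVariable` class, its method names, and the default of 1000 draws are all assumptions made up for illustration. The only idea borrowed from the abstract is that a random quantity is stored as a set of simulation draws and can then be mixed with ordinary numbers in arithmetic.

```python
import random
import statistics


class RandomVariable:
    """Hypothetical type: a quantity represented implicitly by simulation draws."""

    def __init__(self, draws):
        self.draws = list(draws)

    @classmethod
    def normal(cls, mean, sd, n=1000, rng=None):
        # Assumption: n draws from a normal distribution stand in for the variable.
        rng = rng or random.Random()
        return cls(rng.gauss(mean, sd) for _ in range(n))

    def _lift(self, other, op):
        # Apply op draw by draw; plain numbers are treated as deterministic.
        if isinstance(other, RandomVariable):
            if len(other.draws) != len(self.draws):
                raise ValueError("random variables need the same number of draws")
            return RandomVariable(op(a, b) for a, b in zip(self.draws, other.draws))
        return RandomVariable(op(a, other) for a in self.draws)

    def __add__(self, other):
        return self._lift(other, lambda a, b: a + b)

    __radd__ = __add__  # addition is commutative, so 1 + rv reuses rv + 1

    def __mul__(self, other):
        return self._lift(other, lambda a, b: a * b)

    __rmul__ = __mul__  # likewise for 2 * rv

    def __sub__(self, other):
        return self._lift(other, lambda a, b: a - b)

    def __rsub__(self, other):
        return self._lift(other, lambda a, b: b - a)

    def mean(self):
        return statistics.fmean(self.draws)

    def sd(self):
        return statistics.stdev(self.draws)


if __name__ == "__main__":
    rng = random.Random(42)
    height = RandomVariable.normal(170.0, 10.0, rng=rng)  # uncertain quantity
    heels = 5.0                                           # deterministic quantity
    total = height + heels                                # reads like plain arithmetic
    print(round(total.mean(), 1), round(total.sd(), 1))
```

The point of the design is the last few lines: once the simulations are hidden inside the type, code that combines uncertain and fixed quantities looks the same as code that works on plain numbers.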
ENTRIES TAGGED "language"
Organising Conferences, Moving to the JVM, Language Crowdsourcing, and Bayesian Computing
Internet Access Rights, Statistical Peace, Vintage Jobs, and Errata Etymology
- Right to Access the Internet — a survey of different countries’ rights to access the Internet.
- Peace Through Statistics — three ex-Yugoslavian statisticians nominated for Nobel Peace Prize. “In war-torn and impoverished countries, statistics provides a welcome arena in which science runs independent of ethnicity and religion. With so few resources, many countries are graduating few, if any, PhDs in statistical sciences. These statisticians collaboratively began a campaign to collect together the basics underlying statistics and statistics education, with the hope of increasing access to statistical ideas, knowledge and training around the world.”
- Vintage Steve Jobs (YouTube) — he’s launching the “Think Different” campaign, but it’s a great reminder of what a powerful speaker he is and a look at how he thinks about marketing.
- Anatomy of a Fake Quotation (The Atlantic) — deconstructing how the words of a 24-year-old English teacher in Japan sped around the world, attributed to Martin Luther King.
One-Click Zeroed Down Under, Piracy, One Site To Rule Them All, and English Language
- Telstra Scores Patent Win over Amazon (ZDNet) — The delegate of the Commissioner of Patents, Ed Knock, found this week that Amazon’s 1-click buy facility “lacks novelty [and] an inventive step”, making Amazon’s claim unpatentable.
- The Final Answer for What To Do To Prevent Piracy (Jeff Vogel) — His advice is to do the minimum to encourage people to pay, as “Anything beyond that will inconvenience your paying customers and do little to nothing to prevent piracy.”
- alpha.gov.uk — an experimental prototype of a single interface to all government services. Governments have been trying these for years. This one’s different: it’s not built by the highest bidder; it’s the result of a lean team headed by the stellar Tom Loosemore (ex-BBC). It’s prototyping the idea of using lightweight reusable syndication-friendly components (decision trees, calculators, guides, etc.) to build such a site. My suspicion, though, is that government websites are a people problem, not a technology problem.
- A StackExchange for the English Language — what’s the collective noun for pedants?
A Princeton search algorithm uses language indicators to measure importance.
A search algorithm being developed by Princeton University researchers parses language to determine relevance. Academic application is one possibility, but this type of algorithm could also extend to news recommendations.
Big data as a discipline or a conference topic is still in its formative years.
Big data is a massive opportunity, but the language used to describe it ("goldrush," "data deluge," "firehose," etc.) reveals we're still searching for its identity.
Borders, Monitoring, Data Visualization, and Localization
- What Went Wrong at Borders (The Atlantic) — a short summary of the decline and fall of Borders. Borders has a special place in our hearts at O’Reilly: it was a buyer for Borders who pointed out that Programming Perl was one of their top-selling books in any category, which got Tim focused on the Open Source story.
- Virtues of Monitoring — great explanation of the different levels of monitoring you could (and should) have in your application. (via Simon Willison)
- Getting Started with Processing and Data Visualization — a quick intro to building data visualizations with Processing. Nice variety in the examples, too. (via Hacker News)
- A Localization Horror Story — how hard it is to localize correctly. A wonderful article that is ruthlessly accurate in its descriptions of the pains of localizing software, which is no easier today despite the article being over a decade old.
Mobile Clawback, Language Design, Gawker Hacked, and Science Tools
- European mobile operators say big sites need to pay for users’ data demands (Guardian) — it’s like the postal service demanding that envelope makers pay them because they’re not making enough money just selling stamps. What idiocy.
- Grace Programming Language — language designers working on a new teaching language.
- Gawker Media’s Entire Database Hacked — 1.5M usernames and passwords, plus content from their databases, in a torrent. What’s your plan to minimize the harm of an event like this, and to recover? (via Andy Baio)
- Macmillan Do Interesting Stuff (Cameron Neylon) — have acquired some companies that provide software tools to support scientists, and are starting a new line of business around it. I like it because it’s a much closer alignment of scientists’ interests with profit motive than, say, journals. Timo Hannay, who heads it, runs Science Foo Camp with Google and O’Reilly.
Search Tips, Web Parsing, DNS Blacklists, Complex Machines
- Hidden Features of Google (StackExchange) — rather than Google’s list of search features, here are the features that real (sophisticated) users find useful. My new favourite: the ~ operator for approximate searching. (via Hacker News)
- Natural Language Parsing for the Web — JSON API to the Stanford Natural Language Parser. I wonder why the API to the library isn’t an open source library, given the Stanford parser is GPLv2. It’d be super-cool to have this as an EC2 instance, Ubuntu package, or Chef recipe so it’s trivial to add to an existing hosted project.
- Taking Back the DNS (Paul Vixie) — defining a spec whereby you can subscribe to blacklists for DNS, as “Most new domain names are malicious.”
- Building Complex Machines with Lego — I saw the (Lego) Antikythera Mechanism at Sci Foo. It’s as amazing as it looks.
Fast Scans, Touch Screens, Privacy Newspeak, and Open Source Fonts
- High-Speed Book Scanner — you flip the pages, and it uses high-speed photography to capture images of each page. “But they’re all curved!” Indeed, so they project a grid onto the page so as to be able to correct for the curvature. The creator wanted to scan Manga, but the first publisher he tried turned him down. I’ve written to him offering a pile of O’Reilly books to test on. We love this technology!
- Magic Tables, not Magic Windows (Matt Jones) — thoughtful piece about how touch-screens are rarely used as a controller of abstract things rather than of real things, with some examples of the potential he’s talking about. “When we’re not concentrating on our marbles, we’re looking each other in the eye – chuckling, tutting and cursing our aim – and each other. There’s no screen between us, there’s a magic table making us laugh. It’s probably my favourite app to show off the iPad – including the ones we’ve designed! It shows that the iPad can be a media surface to share, rather than a proscenium to consume through alone.”
- Mensch Font — an interesting font, but this particularly caught my eye: “Naturally I searched for a font editor, and the best one I found was Font Forge, an old Linux app ported to the Mac but still requiring X11. So that’s two ways OS X is borrowing from Linux for font support. What’s up with that? Was there an elite cadre of fontistas working on Linux machines in a secret bunker? Linux is, um, not usually known for its great designers.” (via joshua on Delicious)