ENTRIES TAGGED "language"

Four short links: 5 July 2011

Organising Conferences, Moving to the JVM, Language Crowdsourcing, and Bayesian Computing

  1. Conference Organisers Handbook — accurate guide to running a two-day 300-person conference. See also Yet Another Perl Conference guidelines.
  2. Twitter Shifting More Code to JVM — interesting how, at scale, there are some tools and techniques of the scorned Enterprise that the web cool kids must turn to. Some. Business Process Workflow XML Schemas will never find love.
  3. Luis von Ahn on Duolingo — from the team that gave us “OCR books as you verify you are a human” CAPTCHAs comes “learn a new language as you translate the web”. I would love to try this; it sounds great (and is an example of what crowdsourcing can be).
  4. Fully Bayesian Computing (PDF) — “A fully Bayesian computing environment calls for the possibility of defining vector and array objects that may contain both random and deterministic quantities, and syntax rules that allow treating these objects much like any variables or numeric arrays. Working within the statistical package R, we introduce a new object-oriented framework based on a new random variable data type that is implicitly represented by simulations.” Perl made text processing easy because strings were first-class objects with a rich set of functions to operate on them; Node.js has a sweet HTTP library; it’s interesting to see how much more intuitive an algorithm becomes when random variables are a data type. (via BigData)
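To make the "random variables as a data type" idea from the last link concrete, here is a minimal sketch in Python rather than R. It is not the paper's framework or API: the `RandomVariable` class, its `normal` constructor and its `prob` helper are hypothetical names made up for illustration. The only idea borrowed from the abstract is that a random quantity is represented implicitly by a vector of simulation draws, so ordinary arithmetic can mix random and deterministic values.

```python
import random
import statistics


class RandomVariable:
    """Hypothetical type: a random quantity stored as simulation draws."""

    def __init__(self, draws):
        self.draws = list(draws)

    @classmethod
    def normal(cls, mean, sd, n=10_000, seed=None):
        # Draw n samples from a normal distribution with a private RNG.
        rng = random.Random(seed)
        return cls(rng.gauss(mean, sd) for _ in range(n))

    def _combine(self, other, op):
        # Random with random: combine draw by draw.
        if isinstance(other, RandomVariable):
            if len(other.draws) != len(self.draws):
                raise ValueError("random variables need the same number of draws")
            return RandomVariable(op(a, b) for a, b in zip(self.draws, other.draws))
        # Random with deterministic: apply the constant to every draw.
        return RandomVariable(op(a, other) for a in self.draws)

    def __add__(self, other):
        return self._combine(other, lambda a, b: a + b)

    __radd__ = __add__  # addition is commutative, so 1 + x works too

    def __mul__(self, other):
        return self._combine(other, lambda a, b: a * b)

    __rmul__ = __mul__  # likewise 2 * x

    def mean(self):
        return statistics.fmean(self.draws)

    def sd(self):
        return statistics.stdev(self.draws)

    def prob(self, predicate):
        # Monte Carlo estimate of P(predicate(X)).
        return sum(1 for d in self.draws if predicate(d)) / len(self.draws)


# Independent draws (different seeds), then plain arithmetic on them.
x = RandomVariable.normal(0.0, 1.0, seed=1)
y = RandomVariable.normal(5.0, 2.0, seed=2)
z = 2 * x + y + 1
```

Here `z` is itself a `RandomVariable`, so `z.mean()`, `z.sd()` and `z.prob(lambda d: d > 6.0)` summarise the derived quantity without any explicit loop over simulations in the calling code, which is the kind of readability gain the commentary above is pointing at.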

Four short links: 27 May 2011

Twitter DB, Data Reliance, Open Source Architectures, and Short-Form Bullying

  1. flockdb (Github) — Twitter’s open source scalable fault-tolerant distributed key-store graph database. (via Twitter’s open source projects page)
  2. How to Kill Innovation in Five Easy Steps (Tech Republic) — point four is interesting: “Rely too heavily on data and dashboards.” It’s good to be reminded of the contra side to the big-data-can-be-mined-for-all-truths attitudes flying around.
  3. Architecture of Open Source Applications — CC-licensed book available through Lulu or for free download. Lots of interesting stories and design decisions to draw from. I know when I learned how Perl worked on the inside, I learned a hell of a lot that I could apply later in life and respected its creators all the more.
  4. Bullying in 140 Letters — it’s about an Australian storm in a teacup, but it made me consider the short-form medium. Short-form negativity can have the added colour/resonance of being snarky and funny. Hard to add colour to short-form positive comments, though. Much harder to be funny and positive than to be funny and negative. Have we inadvertently created a medium where, thanks to the quirks of our language and the way we communicate, it favours negativity over positivity?

Four short links: 19 May 2011

Internet Access Rights, Statistical Peace, Vintage Jobs, and Errata Etymology

  1. Right to Access the Internet — a survey of different countries’ rights to access the Internet.
  2. Peace Through Statistics — three ex-Yugoslavian statisticians nominated for Nobel Peace Prize. In war-torn and impoverished countries, statistics provides a welcome arena in which science runs independent of ethnicity and religion. With so few resources, many countries are graduating few, if any, PhDs in statistical sciences. These statisticians collaboratively began a campaign to collect together the basics underlying statistics and statistics education, with the hope of increasing access to statistical ideas, knowledge and training around the world.
  3. Vintage Steve Jobs (YouTube) — he’s launching the “Think Different” campaign, but it’s a great reminder of what a powerful speaker he is and a look at how he thinks about marketing.
  4. Anatomy of a Fake Quotation (The Atlantic) — deconstructing how the words of a 24-year-old English teacher in Japan sped around the world, attributed to Martin Luther King.

Four short links: 12 May 2011

One-Click Zeroed Down Under, Piracy, One Site To Rule Them All, and English Language

  1. Telstra Scores Patent Win over Amazon (ZDNet) — The delegate of the Commissioner of Patents, Ed Knock, found this week that Amazon’s 1-click buy facility “lacks novelty [and] an inventive step”, making Amazon’s claim unpatentable.
  2. The Final Answer for What To Do To Prevent Piracy (Jeff Vogel) — His advice is to do the minimum to encourage people to pay, as “Anything beyond that will inconvenience your paying customers and do little to nothing to prevent piracy.”
  3. alpha.gov.uk — an experimental prototype of a single interface to all government services. Governments have been trying these for years. This one’s different: it’s not built by the highest bidder; it’s the result of a lean team headed by the stellar Tom Loosemore (ex-BBC). It’s prototyping the idea of using lightweight reusable syndication-friendly components (decision trees, calculators, guides, etc.) to build such a site. My suspicion, though, is that government websites are a people problem, not a technology problem.
  4. A StackExchange for the English Language — what’s the collective noun for pedants?

Smarter search looks for influence rather than links

A Princeton search algorithm uses language indicators to measure importance.

A search algorithm being developed by Princeton University researchers parses language to determine relevance. Academic application is one possibility, but this type of algorithm could also extend to news recommendations.

Big Data: An opportunity in search of a metaphor

Big data as a discipline or a conference topic is still in its formative years.

Big data is a massive opportunity, but the language used to describe it ("goldrush," "data deluge," "firehose," etc.) reveals we're still searching for its identity.

Four short links: 14 January 2011

Borders, Monitoring, Data Visualization, and Localization

  1. What Went Wrong at Borders (The Atlantic) — a short summary of the decline and fall of Borders. Borders has a special place in our hearts at O’Reilly: it was a buyer for Borders who pointed out that Programming Perl was one of their top-selling books in any category, which got Tim focused on the Open Source story.
  2. Virtues of Monitoring — great explanation of the different levels of monitoring you could (and should) have in your application. (via Simon Willison)
  3. Getting Started with Processing and Data Visualization — a quick intro to building data visualizations with Processing. Nice variety in the examples, too. (via Hacker News)
  4. A Localization Horror Story — how hard it is to localize correctly. A wonderful article that is ruthlessly accurate in its descriptions of the pains of localizing software, which is no easier today despite the article being over a decade old.

Four short links: 13 December 2010

Mobile Clawback, Language Design, Gawker Hacked, and Science Tools

  1. European mobile operators say big sites need to pay for users’ data demands (Guardian) — it’s like the postal service demanding that envelope makers pay them because they’re not making enough money just selling stamps. What idiocy.
  2. Grace Programming Language — language designers working on a new teaching language.
  3. Gawker Media’s Entire Database Hacked — 1.5M usernames and passwords, plus content from their databases, in a torrent. What’s your plan to minimize the harm of an event like this, and to recover? (via Andy Baio)
  4. Macmillan Do Interesting Stuff (Cameron Neylon) — have acquired some companies that provide software tools to support scientists, and are starting a new line of business around it. I like it because it’s a much closer alignment of scientists’ interests with profit motive than, say, journals. Timo Hannay, who heads it, runs Science Foo Camp with Google and O’Reilly.

Four short links: 2 August 2010

Search Tips, Web Parsing, DNS Blacklists, and Complex Machines

  1. Hidden Features of Google (StackExchange) — rather than Google’s list of search features, here are the features that real (sophisticated) users find useful. My new favourite: the ~ operator for approximate searching. (via Hacker News)
  2. Natural Language Parsing for the Web — JSON API to the Stanford Natural Language Parser. I wonder why the API to the library isn’t an open source library, given the Stanford parser is GPLv2. It’d be super-cool to have this as an EC2 instance, Ubuntu package, or Chef recipe so it’s trivial to add to an existing hosted project.
  3. Taking Back the DNS (Paul Vixie) — defining a spec whereby you can subscribe to blacklists for DNS, as “Most new domain names are malicious.”
  4. Building Complex Machines with Lego — I saw the (Lego) Antikythera Mechanism at Sci Foo. It’s as amazing as it looks.

Four short links: 22 June 2010

Fast Scans, Touch Screens, Privacy Newspeak, and Open Source Fonts

  1. High-Speed Book Scanner — you flip the pages, and it uses high-speed photography to capture images of each page. “But they’re all curved!” Indeed, so they project a grid onto the page so as to be able to correct for the curvature. The creator wanted to scan Manga, but the first publisher he tried turned him down. I’ve written to him offering a pile of O’Reilly books to test on. We love this technology!
  2. Magic Tables, not Magic Windows (Matt Jones) — thoughtful piece about how touch-screens are rarely used as a controller of abstract things rather than of real things, with some examples of the potential he’s talking about. “When we’re not concentrating on our marbles, we’re looking each other in the eye – chuckling, tutting and cursing our aim – and each other. There’s no screen between us, there’s a magic table making us laugh. It’s probably my favourite app to show off the iPad – including the ones we’ve designed! It shows that the iPad can be a media surface to share, rather than a proscenium to consume through alone.”
  3. Myths and Fallacies of Personally Identifiable Information — particularly relevant after reading Apple’s new iTunes privacy policy. “We talk about the technical and legal meanings of “personally identifiable information” (PII) and argue that the term means next to nothing and must be greatly de-emphasized, if not abandoned, in order to have a meaningful discourse on data privacy.” (via Pete Warden)
  4. Mensch Font — an interesting font, but this particularly caught my eye: “Naturally I searched for a font editor, and the best one I found was Font Forge, an old Linux app ported to the Mac but still requiring X11. So that’s two ways OS X is borrowing from Linux for font support. What’s up with that? Was there an elite cadre of fontistas working on Linux machines in a secret bunker? Linux is, um, not usually known for its great designers.” (via joshua on Delicious)