"text analysis" entries

Four short links: 24 May 2012

Maker Tribe, Concept Mapping, Magic Wand, and Site Performance Matters

  1. Last Saturday My Son Found His People at the Maker Faire — aww to the power of INFINITY.
  2. Dictionaries Linking Words to Concepts (Google Research) — Wikipedia entries for concepts, text strings from searches and the oppressed workers down the Text Mines, and a count indicating how often the two were related.
  3. Magic Wand (Kickstarter) — I don’t want the game, I want a Bluetooth magic wand. I don’t want to click the OK button, I want to wave a wand and make it so! (via Pete Warden)
  4. E-Commerce Performance (Luke Wroblewski) — If a page load takes more than two seconds, 40% are likely to abandon that site. This is why you should follow Steve Souders like a hawk: if your site is slower than it could be, you’re leaving money on the table.
Comment: 1
Four short links: 19 April 2012

Text Similarity, Designing Engagement, Clustering Stories, and Prince of Persia

  1. Superfastmatch — open source text comparison tool, used to locate plagiarism/churnalism in online news sites. You can pull out the text engine and use it for your own “find where this text is used elsewhere” applications (e.g., what’s being forwarded out in email, how much of this RFP is copy and paste, what’s NOT boilerplate in this contract, etc.). (via Pete Warden)
  2. Ten Design Principles for Engaging Math Tasks (Dan Meyer) — education gold, engagement gold, and some serious ideas you can use in your own apps.
  3. Clustering Related Stories (Jenny Finkel) — description of how to cluster related stories, talks about some of the tricks. Interesting without being too scary.
  4. Prince of Persia (GitHub) — I have waited to see if the novelty wore off, but I still find this cool: 1980s source code on GitHub.
Comment

Visualization of the Week: Anachronistic language in “Mad Men”

A look at the historical accuracy of "Mad Men's" dialogue.

"Mad Men" is praised for its precise attention to historical visuals, but how does its dialogue stack up against text from the 1960s? Ben Schmidt's new visualization explores that question.

Comment: 1
Four short links: 23 March 2012

Caching Pages, Node NLP, Digital Natives are Clueless, and Wal-Mart Loves Your Calendar

  1. Cache Them If You Can (Steve Souders) — the percentage of resources that are cacheable has increased 4% during the past year. Over that same time the number of requests per page has increased 12% and total transfer size has increased 24%.
  2. Natural — MIT-licensed general natural language facility for Node.js. Tokenizing, stemming, classification, phonetics, tf-idf, WordNet, string similarity, and some inflection are currently supported. (via JavaScript Weekly)
  3. How Millennials Search — Statistically significant findings suggest that millennial generation Web searchers proceed erratically through an information search process, make only a limited attempt to evaluate the quality or validity of information gathered, and may perform some level of ‘backfilling’ or adding sources to a research project before final submission of the work. Never let old people tell you that “digital natives” actually know what they’re doing.
  4. Walmart Buys A Facebook App for Calendar Access (Ars Technica) — The Social Calendar app and its file of 110 million birthdays and other events, acquired from Newput Corp., will give Walmart the ability to expand its efforts to dig deeper into the lives of customers. Interesting to think that by buying a well-loved app, a company could get access to your Facebook details whether you Like them or not.
Comment
Four short links: 16 February 2012

Wikipedia Fail, DIY Text Adventures, Antisocial Software, and Formats Matter

  1. The Undue Weight of Truth (Chronicle of Higher Education) — Wikipedia has become fossilized fiction because the mechanism of self-improvement is broken.
  2. Playfic — Andy Baio’s new site that lets you write text adventures in the browser. Great introduction to programming for language-loving kids and adults.
  3. Review of Alone Together (Chris McDowall) — I loved this review, its sentiments, and its presentation. Work on stuff that matters.
  4. Why ESRI As-Is Can’t Be Part of the Open Government Movement — data formats without broad support in open source tools are an unnecessary barrier to entry. You’re effectively letting the vendor charge for your data, which is just stupid.
Comment
Four short links: 8 February 2012

Text Mining, Unstoppable Sociality, Unicode Fun, and Scholarly Publishing

  1. Mavuno — an open source, modular, scalable text mining toolkit built upon Hadoop. (Apache-licensed)
  2. Cow Clicker — Wired profile of Cow Clicker creator Ian Bogost. I was impressed by Cow Clickers [...] have turned what was intended to be a vapid experience into a source of camaraderie and creativity. People create communities around social activities, even when they are antisocial. (via BoingBoing)
  3. Unicode Has a Pile of Poo Character (BoingBoing) — this is perfect.
  4. The Research Works Act and the Breakdown of Mutual Incomprehension (Cameron Neylon) — an excellent summary of how researchers and publishers view each other and their place in the world.
Comment
Four short links: 13 January 2012

Internet in Culture, Flash Security Tool, Haptic E-Books, and Facebook Mining Private Updates

  1. How The Internet Gets Inside Us (The New Yorker) — at any given moment, our most complicated machine will be taken as a model of human intelligence, and whatever media kids favor will be identified as the cause of our stupidity. When there were automatic looms, the mind was like an automatic loom; and, since young people in the loom period liked novels, it was the cheap novel that was degrading our minds. When there were telephone exchanges, the mind was like a telephone exchange, and, in the same period, since the nickelodeon reigned, moving pictures were making us dumb. When mainframe computers arrived and television was what kids liked, the mind was like a mainframe and television was the engine of our idiocy. Some machine is always showing us Mind; some entertainment derived from the machine is always showing us Non-Mind. (via Tom Armitage)
  2. SWFScan — Windows-only Flash decompiler to find hardcoded credentials, keys, and URLs. (via Mauricio Freitas)
  3. Paranga — haptic interface for flipping through an ebook. (via Ben Bashford)
  4. Facebook Gives Politico Deep Access to Users’ Political Sentiments (All Things D) — Facebook will analyse all public and private updates that mention candidates and an exclusive partner will “use” the results. Remember, if you’re not paying for it then you’re the product and not the customer.
Comment: 1
Four short links: 12 January 2012

Smart Meter Snitches, Company Culture, Text Classification, and Live Face Substitution

  1. Smart Hacking for Privacy — you can mine smart power meter data (or even snoop on it) to learn what’s on the TV. Wow. (You can also watch the talk.) (via Rob Inskeep)
  2. Conditioning Company Culture (Bryce Roberts) — a short read but thought-provoking. It’s easy to create mindless mantras, but I’ve seen the technique that Bryce describes and (when done well) it’s highly effective.
  3. hydrat (Google Code) — a declarative framework for text classification tasks.
  4. Dynamic Face Substitution (FlowingData) — Kyle McDonald and Arturo Castro play around with a face tracker and color interpolation to replace their own faces, in real time, with those of celebrities such as Brad Pitt and Paris Hilton. Awesome. And creepy. Amen.
Comment: 1
The hidden language and "wonderful experience" of product reviews

Panagiotis Ipeirotis on the phrases and formatting of effective product reviews.

How much is an Amazon review — good or bad — worth? Computer scientist and NYU professor Panagiotis Ipeirotis analyzed the text in thousands of Amazon reviews to find out.

Comments: 3