"science" entries

Four short links: 11 June 2014

Right to Mine, Summarising Microblogs, C Sucks for Stats, and Scanning Logfiles

  1. UK Copyright Law Permits Researchers to Data Mine — changes mean Copyright holders can require researchers to pay to access their content but cannot then restrict text or data mining for non-commercial purposes thereafter, under the new rules. However, researchers that use the text or data they have mined for anything other than a non-commercial purpose will be said to have infringed copyright, unless the activity has the consent of rights holders. In addition, the sale of the text or data mined by researchers is prohibited. The derivative works will be very interesting: if a university mines the journals, finds a new possibility for a Thing, and that possibility is verified experimentally, is that Thing the university’s to license commercially for profit?
  2. Efficient Online Summary of Microblogging Streams (PDF) — research paper. The algorithm we propose uses a word graph, along with optimization techniques such as decaying windows and pruning. It outperforms the baseline in terms of summary quality, as well as time and memory efficiency.
  3. Statistical Shortcomings in Standard Math Libraries — or “Why C Derivatives Are Not Popular With Statistical Scientists”. The following mathematical functions are necessary for implementing any rudimentary statistics application; and yet they are general enough to have many applications beyond statistics. I hereby propose adding them to the standard C math library and to the libraries which inherit from it. For purposes of future discussion, I will refer to these functions as the Elusive Eight.
  4. fail2ban — open source tool that scans logfiles for signs of malice, and triggers actions (e.g., iptables updates).
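
The scan-then-act pattern described for fail2ban in item 4 can be sketched in a few lines. This is a minimal illustration, not fail2ban's own code: the log format, the `find_offenders` and `ban_command` helpers, and the `max_retries` threshold are all assumptions made for the example, and the iptables command is only built, never run.

```python
import re
from collections import Counter

# Assumed log format, loosely modelled on sshd authentication failures.
FAILED_LOGIN = re.compile(r"Failed password for .* from (\d{1,3}(?:\.\d{1,3}){3})")


def find_offenders(lines, max_retries=3):
    """Return the IPs with at least max_retries failed logins, sorted."""
    failures = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            failures[match.group(1)] += 1
    return sorted(ip for ip, count in failures.items() if count >= max_retries)


def ban_command(ip):
    """Build (but do not execute) an iptables rule that drops traffic from ip."""
    return ["iptables", "-I", "INPUT", "-s", ip, "-j", "DROP"]


# Hypothetical sample data using documentation-reserved IP ranges.
sample_log = [
    "Jun 11 10:00:01 host sshd[101]: Failed password for root from 203.0.113.7 port 4000 ssh2",
    "Jun 11 10:00:03 host sshd[102]: Failed password for invalid user admin from 203.0.113.7 port 4001 ssh2",
    "Jun 11 10:00:05 host sshd[103]: Accepted password for nat from 198.51.100.2 port 5000 ssh2",
    "Jun 11 10:00:07 host sshd[104]: Failed password for root from 203.0.113.7 port 4002 ssh2",
    "Jun 11 10:00:09 host sshd[105]: Failed password for nat from 198.51.100.2 port 5001 ssh2",
]

for offender in find_offenders(sample_log):
    print(" ".join(ban_command(offender)))
```

A real tool would also expire bans after a while and only count failures inside a time window; the sketch leaves both out to keep the core idea visible.
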
Four short links: 18 April 2014

Interview Tips, Data of Any Size, Science Writing, and Instrumented Javascript

  1. 16 Interviewing Tips for User Studies — these apply to many situations beyond user interviews, too.
  2. The Backlash Against Big Data contd. (Mike Loukides) — Learn to be a data skeptic. That doesn’t mean becoming skeptical about the value of data; it means asking the hard questions that anyone claiming to be a data scientist should ask. Think carefully about the questions you’re asking, the data you have to work with, and the results that you’re getting. And learn that data is about enabling intelligent discussions, not about turning a crank and having the right answer pop out.
  3. The Science of Science Writing (American Scientist) — also applicable beyond the specific field for which it was written.
  4. earhorn — Earhorn instruments your JavaScript and shows you a detailed, reversible, line-by-line log of JavaScript execution, sort of like console.log’s crazy uncle.
Four short links: 16 April 2014

Time Series, CT Scanner, Reading List, and Origami Microscope

  1. morris.js — pretty time-series line graphs.
  2. Open Source CT Scanner — all the awesome.
  3. Alan Kay’s Reading List — in case you’re wondering what to add to the pile beside your bed. (via Alex Dong)
  4. Foldscope — origami optical microscope, 2000x magnification for under $1.
Four short links: 9 April 2014

Internet of Listeners, Mobile Deep Belief, Crowdsourced Spectrum Data, and Quantum Minecraft

  1. Jasper Project — an open source platform for developing always-on, voice-controlled applications. Shouting is the new swiping—I eagerly await Gartner touting the Internet-of-things-that-misunderstand-you.
  2. DeepBeliefSDK — deep neural network library for iOS. (via Pete Warden)
  3. Microsoft Spectrum Observatory — crowdsourcing spectrum utilisation information. Just open sourced their code.
  4. qcraft — beginner’s guide to quantum physics in Minecraft. (via Nelson Minar)
Four short links: 27 March 2014

Understanding Image Processing, Sharing Data, Fixing Bad Science, and Delightful Dashboard

  1. 2D Image Post-Processing Techniques and Algorithms (DIY Drones) — understanding how automated image matching and processing tools work means you can also get a better understanding how to shoot your images and what to prevent to get good matches.
  2. Scientists Need to Learn to Share — despite science’s reputation for rigor, sloppiness is a substantial problem in some fields. You’re much more likely to check your work and follow best data-handling practices when you know someone is going to run your code and parse your data.
  3. METRICS — Meta-Research Innovation Center at Stanford. John Ioannidis has a posse: connecting researchers into weak science, running conferences, creating a “journal watch”, and engaging policy makers. (says The Economist)
  4. Grafana — elegant dashboard for graphite (the realtime data graphing engine).
Four short links: 12 March 2014

Web Past, Web Future, Automated Jerkholism, and Science Education

  1. High Volume Web Sites — Tim Berners-Lee answers my question on provisioning a popular web server in 1993. The info.cern.ch server which has the Subject Catalogue gets probably a relatively high usage, about 10k requests a day, or (thinks…) one every 9 seconds. the CPU load is negligible. In fact of course the peak rate is higher, but still its not really a factor. That was when the server forked a subprocess for each request, too. See also one of my early contributions to the nascent field of web operations (language alert).
  2. Tim Berners-Lee Calls For Web Magna Carta (Guardian) — Unless we have an open, neutral internet we can rely on without worrying about what’s happening at the back door, we can’t have open government, good democracy, good healthcare, connected communities and diversity of culture. It’s not naive to think we can have that, but it is naive to think we can just sit back and get it.
  3. BroApp — Automatically message your girlfriend sweet things so you can spend more time with the Bros. Reminds me of the Electric Monk in Dirk Gently’s Holistic Detective Agency. The monk notices that humans have machines to watch TV for them. Now we have machines to be shitty boyfriends for us. (via Beta Knowledge)
  4. World Science U — quick answers, short courses, long MOOCs. I wonder how you’d know whether this was effective at increasing scientific literacy, and therefore whether it’d be worth doing for computational thought or programming.
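
The request rate quoted in item 1 ("about 10k requests a day, or ... one every 9 seconds") can be checked with quick arithmetic. The helper name below is made up for illustration, and the figure is an average, which is why the quoted message notes that the peak rate is higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def seconds_between_requests(requests_per_day):
    """Average gap between requests, assuming they are spread evenly over the day."""
    return SECONDS_PER_DAY / requests_per_day


interval = seconds_between_requests(10_000)
print(f"{interval:.2f} seconds between requests")  # 8.64, i.e. roughly one every 9 seconds
```
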
Four short links: 28 February 2014

Minecraft+Pi+Python, Science Torrents, Web App Performance Measurement, and Streaming Data

  1. Programming Minecraft Pi with Python — an early draft, but shows promise for kids. (via Raspberry Pi)
  2. Terasaur — BitTorrent for mad-large files, making it easy for datasets to be saved and exchanged.
  3. Bucky — Open-source tool to measure the performance of your web app directly from your users’ browsers. Nifty graph.
  4. Zoe Keating’s Streaming Payouts — actual data on a real musician’s distribution and revenues through various channels. Hint: streaming is tragicomically low-paying. (via Andy Baio)

DIYbio and the hacking metaphor

Definitive answers require further testing

The following is from the second issue of BioCoder, the quarterly newsletter for synthetic biologists, DIY biologists, neurobiologists, and more. Download your free copy today.


Within DIYbio, one cannot escape the hacking metaphor. The metaphor is ubiquitous and, to a point, useful. The term connotes both productive play with an existing technology aimed at improvement and, at the same time, play with sinister undertones. In this sense, hacking captures the promise and pitfalls of the dual uses any mature technology might be put to, whether that technology is as dramatic as nuclear power/weapons or as mundane as a free/premium software license. But every metaphor has its limits. Pushed too far, metaphors break down, and instead of illuminating, they obscure. Which brings me to ask: how far can the hacking metaphor be pushed within DIYbio—at least the part of DIYbio falling in line with synthetic biology?

Four short links: 30 January 2014

In-Game Economy, AI Ethics, Data Repository, and Regulated Disruption

  1. $200k of Spaceships Destroyed (The Verge) — More than 2,200 of the game’s players, members of EVE’s largest alliances, came together to shoot each other out of the sky. The resultant damage was valued at more than $200,000 of real-world money. […] Already, the battle has had an impact on the economics and politics of EVE’s universe: as both sides scramble to rearm and rebuild, the price of in-game resource tritanium is starting to rise. “This sort of conflict,” Coker said, “is what science fiction warned us about.”
  2. Google Now Has an AI Ethics Committee (HufPo) — sorry for the HufPo link. One of the requirements of the DeepMind acquisition was that Google agreed to create an AI safety and ethics review board to ensure this technology is developed safely. Page’s First Law of Robotics: A robot may not block an advertisement, nor through inaction, allow an advertisement to come to harm.
  3. Academic Torrents — a scalable, secure, and fault-tolerant repository for data, with blazing fast download speeds built on BitTorrent.
  4. Hack Schools Meet California Regulators (Venturebeat) — turns out vocational training is a regulated profession. Regulation meets disruption, annihilate in burst of press releases.
Four short links: 21 January 2014

Mature Engineering, Control Theory, Open Access USA, and UK Health Data Too-Open?

  1. On Being a Senior Engineer (Etsy) — Mature engineers know that no matter how complete, elegant, or superior their designs are, it won’t matter if no one wants to work alongside them because they are assholes.
  2. Control Theory (Coursera) — Learn about how to make mobile robots move in effective, safe, predictable, and collaborative ways using modern control theory. (via DIY Drones)
  3. US Moves Towards Open Access (WaPo) — Congress passed a budget that will make about half of taxpayer-funded research available to the public.
  4. NHS Patient Data Available for Companies to Buy (The Guardian) — Once live, organisations such as university research departments – but also insurers and drug companies – will be able to apply to the new Health and Social Care Information Centre (HSCIC) to gain access to the database, called care.data. If an application is approved then firms will have to pay to extract this information, which will be scrubbed of some personal identifiers but not enough to make the information completely anonymous – a process known as “pseudonymisation”. Recipe for disaster as it has been repeatedly shown that it’s easy to identify individuals, given enough scrubbed data. Can’t see why the NHS just doesn’t make it an app in Facebook. “Nat’s Prostate status: it’s complicated.”