"opensource" entries

Four short links: 4 January 2010

Code for Speed, Wooden Locks, Font Design, and a Java Distributed Data Store

  1. Why Git Is So Fast — interesting mailing list post about the problems that the JGit folks had when they tried to make their Java version of Git go faster. Higher level languages hide enough of the machine that we can’t make all of these optimizations. A reminder that you must know and control the systems you’re running on if you want to get great performance. (via Hacker News)
  2. Wooden Combination Lock — you’ll easily understand how combination locks work with this fine piece of crafty construction work.
  3. From Moleskine to Market — how a leading font designer designs fonts. Fascinating, and beautiful, and it makes me covet his skills.
  4. Terrastore — open source distributed document store, HTTP accessible, data and queries are distributed, built on Terracotta which is built on ehcache (updated: Terracotta has an ehcache plugin, but isn’t built on ehcache). A NoSQL database built on Java tools that serious Java developers respect, the first such one that I’ve noticed (update: I brain-farted: neo4j was definitely on my radar). Notice that all the interesting work going on in the NoSQL arena is happening in open source projects.
Four short links: 30 December 2009

Time Management, CS Education, Installing EtherPad, Infoengravings

  1. How to Run a Meeting Like Google (BusinessWeek) — the temptation is to mock things like “even five minute meetings must have an agenda”, but my sympathy with Marissa Mayer is high. The more I try to cram into a work day, the more I have to be able to justify every part of it. If you can’t tell me why you want to see me for five minutes, then I probably have better things to be doing. There may be false culls (missing something important because the “process” is too high) but I bet these are far outweighed by the missed opportunities if time isn’t so structured.
  2. Computer Science Education Week: “December 5-11, 2010, recognizes that computing: Touches everyone’s daily lives and plays a critical role in society; Drives innovation and economic growth; Provides rewarding job opportunities; Prepares students with the knowledge and skills they need for the 21st century.” Worthy, but there’s no mention of the fact that it’s FUN. The brilliant people in this field love what they do. They’re not brilliant 9-5, then heading home to scan the Jobs Wanted to see whether they could earn more as dumptruck drivers in Uranium mines in Australia. CS isn’t for everyone, but it won’t be for anyone unless we help them find the bits they find fun.
  3. Installing EtherPad — step-by-step instructions for installing EtherPad, the open-source real-time text editor recently acquired by Google.
  4. Victorian Infographics — animals, time, and space from the Victorians. It’s beautiful, it’s meaningful, it must be infoengravings.
Four short links: 29 December 2009

Historic Science, Troll Psych, Open Access, Programming Python Games

  1. Turning The Page Online — historic science books in high resolution online. Hooke’s Micrographia was the first view of the microscopic world, and his astonishingly detailed and beautiful illustrations are there to view and print.
  2. Detailed Psychology of Trolls: “You might be surprised to learn that Trolls readily engage in long debates with fellow Trolls – people, that is, whom they know to be perverse and cunning conversation hackers. Apparently, this does not detract them from wasting hours on fruitless debates that are blatantly rigged and full of sophistry. Few Trolls would be happy with debating only fellow Trolls (semi-literate teenagers and hard-boiled fundamentalists are so much tastier – even though they, too, might be trolling you). Yet most of them, every once in a while, enjoy having an absurd argument with another pig-head.” Good on the “know your enemy” basis. (via MindHacks)
  3. Theme Issue — a Royal Society publication ran a special open access issue focusing on “personal perspectives of the life sciences”, where top scientists write about what they think is important. It’s good to see more toes dipped into open access, but I’d love to see more journals (particularly those of professions and associations) move to an entirely open access model. (via SciBlogs)
  4. Invent Your Own Computer Games with Python (2ed) — free ebook that teaches how to program in Python, using games as the motivating examples. Nominally for 10-12 year old children, but (naturally) accessible to adults too. I have not read it, but approve of the attempt.
Four short links: 18 December 2009

Ethics, Parallel Matrices, Browser Math, and Open Source EtherPad

  1. In Character — a journal that addresses a different virtue each quarter. I’ve been thinking of practical philosophy a lot, lately, as we see ever-more-dodgy behaviour. (via bengebre on Delicious)
  2. Lessons from Parallelizing Matrix Multiplication — a reminder why low-level knowledge of your platform matters, and why motivating examples should be carefully chosen.
  3. MathJax: “MathJax is an open source, Ajax-based math display solution designed with a goal of consolidating advances in many web technologies in a single definitive math-on-the-web platform supporting all major browsers.” (via Hacker News)
  4. EtherPad Source — released as part of their Google acquisition. The announcement says: Our goal with this release is to let the world run their own etherpad servers so that the functionality can live on even after we shut down etherpad.com. This is the resolution to the bad reception of the news that EtherPad would close in March with no plan B for users. The cult of entrepreneurship worshipped the customers only as a vehicle to an exit, but I don’t believe that it’s moral to do well personally but leave your customers high and dry. This is a message that the EtherPad founders seem to have got loud and clear.
Four short links: 15 December 2009

Open Source Imagery Analysis, GPL Lawsuits, Small World, Regina v Internet

  1. Opticks: “Opticks is an expandable remote sensing and imagery analysis software platform that is free and open source.” Hugely extensible system. (via geowanking)
  2. Best Buy, Samsung, And Westinghouse Named In SFLC Suit Today (Linux Weekly News) — the Software Freedom Law Center is suing them for selling GPL-derived products without offering the source. They’ve been unresponsive when contacted outside the legal system.
  3. Twitter Helps Reunite Owner with Camera — Kiwi blogger saw camera fall from car in front of him, posted a picture from the camera to his blog and asked “anyone recognize someone from this picture?”. How long do you think it took to get a hit? I love that New Zealand is a village with a seat at the UN.
  4. R vs The Internet — seminar held in New Zealand about the effects of the online world on law, including matters of suppression and contempt. See session notes from TechLiberty and video of the sessions from R2.
Four short links: 10 December 2009

Open Source CMS and OPAC, Timely SQL, A Bid Secret, Basic Research

  1. Scriblio — open source CMS and catalogue built on WordPress, with faceted search and browse. (via titine on Delicious)
  2. Useful Temporal Functions and Queries — SQL tricksies for those working with timeseries data. (via mbiddulph on Delicious)
  3. Optimal Starting Prices for Negotiations and Auctions — Mind Hacks discussion of a research paper on whether high or low initial prices lead to higher price outcomes in negotiations and online auctions. Many negotiation books recommend waiting for the other side to offer first. However, existing empirical research contradicts this conventional wisdom: The final outcome in single and multi-issue negotiations, both in the United States and Thailand, often depends on whether the buyer or the seller makes the first offer. Indeed, the final price tends to be higher when a seller (who wants a higher price and thus sets a high first offer) makes the first offer than when the buyer (who offers a low first offer to achieve a low final price) goes first.
  4. WiFi Science History — Australian scientist studies black holes in the 70s, has to develop a way of piecing together signals that have been distorted as they travel through space. Realizes, when he starts playing with networked computers in the late 80s, that this same technique would let you “cut the wires”. A decade later it emerged as a critical part of wireless networking. As Aaron Small says, it shows the value of basic research, where you don’t have immediate applications in mind and can’t show short-term deliverables or an application to a current high-value problem.
Four short links: 9 December 2009

Bioinformatics Myths, Internet Policy, Archivist Tools, Life Visualisations

  1. The Mythology of Bioinformatics — worth reading this (reprinted from 2002!) separation of hype from history.
  2. Policy and Internet — new journal, with articles such as The Case Against Mass E-mails: Perverse Incentives and Low Quality Public Participation in U.S. Federal Rulemaking: This paper situates a close examination of the 1000 longest modified MoveOn.org-generated e-mails sent to the Environmental Protection Agency (EPA) about its 2004 mercury rulemaking, in the broader context of online grassroots lobbying. The findings indicate that only a tiny portion of these public comments constitute potentially relevant new information for the EPA to consider. The vast majority of MoveOn comments are either exact duplicates of a two-sentence form letter, or they are variants of a small number of broad claims about the inadequacy of the proposed rule. This paper argues that norms, rules, and tools will emerge to deal with the burden imposed by these communications. More broadly, it raises doubts about the notion that online public participation is a harbinger of a more deliberative and democratic era. (via Jordan at InternetNZ)
  3. Xena — GPL-licensed Java software from the National Archives of Australia that detects the file formats of “digital objects” and then converts them into open formats for preservation.
  4. Nebul.us — startup that aggregates and visualises your online activity. In private beta, but there’s a screenshot and brief discussion on Flowing Data.
Four short links: 8 December 2009

Python Moratorium, Math Pictures, Assemblers Needed, Tennis Vision

  1. Python’s Moratorium — Python language designers have declared a moratorium on enhancement proposals (feature requests) while the world’s Python programmers get used to the last batch of New And Shiny they shipped. I’m reasonably sure that the ALGOL designers went through exactly the same discussions, and I know Perl did too. So, don’t be afraid of it – don’t think that Python is evolutionarily dead – it’s not. We’re taking a stability and adoption break, a breather. We’re doing this to help users and developers, not to just be able to say “no” to every random idea sent to python-ideas, and not because we’re done. Reminds me of Perl god Jarkko Hietaniemi’s signature file: “There is this special biologist word we use for ‘stable’. It is ‘dead’.” — Jack Cohen.
  2. This Week’s Finds in Mathematical Physics — I can’t meaningfully contribute to the math, but golly them pictures are purty! (via Hacker News)
  3. x86 Assembly Encounter: “To use a construction industry metaphor, an average x86 assembler has the complexity and usefulness of a hammer, while the DSP world is using high-speed mag-rail blast-o-matic nail guns with automatic feeders and superconducting magnets. […] I find it ridiculous that the most popular computing platform in the world does not have a decent assembler. What’s even worse, from the discussions I’ve seen on the net, people are mostly interested in how fast the assembler is (?!) rather than how much time it saves the programmer.” (via Hacker News)
  4. Finding Tennis Courts in Aerial Photos — more hacking with computer vision techniques and publicly-available data. This is going to lead to good things (and some unpleasant surprises, as that which was formerly “too hard to find” ceases to be so). (via Simon Willison)
Four short links: 7 December 2009

Touchscreen++, Data Analysis, Open Science and Social Software, Google Makes Good

  1. 3D Touchscreens — Japan Science & Technology Agency and researchers at the University of Electro-communications have made a “photoelastic” touch screen. The LCD emits polarized light, picked up by a camera over the screen. Transparent rubber on the screen deforms when pressed, and the camera can pick this up. Interesting hack, though it’s not yet a consumer-grade product.
  2. Eureqa — open source tool for detecting equations and hidden mathematical relationships in your data. Its primary goal is to identify the simplest mathematical formulas which could describe the underlying mechanisms that produced the data. (via pigor on delicious)
  3. Science in the Open, It Wasn’t Supposed To Be This Way — Cameron Neylon on the leaked climate email messages as a trigger for open data. “One of the very few credible objections to open research that I have come across is that by making material available you open your inbox to a vast community of people who will just waste your time. The people who can’t be bothered to read the background literature or learn to use the tools; the ones who just want the right answer. […] my concern is that in a kneejerk response to suddenly make things available no-one will think to put in place the social and technical infrastructure that we need to support positive engagement, and to protect active researchers, both professional and amateur from time-wasters.” Sounds like an open science call for social software, though I’m not convinced it’s that easy. Humans can’t distinguish revolutionaries from terrorists, so it’s unclear why we think computers should be able to.
  4. EtherPad Back Online Until Open Sourced — Google bought collaborative real-time EtherPad and the team will work on Google Wave, but the transition plan was “you can’t create more documents, and it’ll all go away in March”. Grumpiness ensued. Everyone makes mistakes online, but the secret is to listen, acknowledge the mistake, and correct your course.
Four short links: 1 December 2009

Open Source Cinema Camera, Collaborative Filtering, Message Queue for Replication, Facebook Data Warehouse Numbers

  1. Apertus — open source cinema camera. (via joshua on Delicious)
  2. A Survey of Collaborative Filtering Techniques: “From basic techniques to the state-of-the-art, we attempt to present a comprehensive survey for CF techniques, which can be served as a roadmap for research and practice in this area.” (via bos on Delicious)
  3. Drizzle Replication using RabbitMQ as Transport — we’re watching the growing use of message queues in web software, and here’s an interesting application. (via sogrady on Delicious)
  4. Facebook Data Team: Distributed Data Analysis at Facebook — job ad from Facebook gives numbers on company use of their Hive data warehouse tool built on top of Hadoop: Today, Facebook counts 29% of its employees (and growing!) as Hive users. More than half (51%) of those users are outside of Engineering. They come from distinct groups like User Operations, Sales, Human Resources, and Finance. Many of them had never used a database before working here. Thanks to Hive, they are now all data ninjas who are able to move fast and make great decisions with data. (via Simon Willison)