Free The Facts: Critical Issue, Killer Presentation

Dave Gray’s Free The Facts presentation is a must-read, must-share for anyone who cares about either science or open access.

It’s also a masterpiece of presentation economy, and a fantastic demonstration of how to make a text-heavy presentation into something magical, reminiscent of the work of Michael Wesch. (It’s also a fascinating demonstration of the convergence of YouTube, Flickr, and Slideshare as communication and teaching tools, and a foretaste of the generational change that the New York Times hinted at a few weeks ago.)

  • So glad you included the link to the Michael Wesch YouTube production. Culturally, we tend to dismiss the communication styles of the emerging youth as “stages” they’re moving through. It’s now a horrific mistake to do that. The convergence of technology, education, and social connection is changing the paths young people can take. To view technology as a distraction is to stick our heads in the sand.

  • What a fantastic and compelling piece of work. It’s probably the best visual argument I’ve ever seen. But, something is nagging at me.

    The scientific method, which Gray summarizes very well, is being overtaken by something new. When data was really expensive, a thesis was necessary in advance of measurement. The entire foundation of science is a sort of cost savings approach because lab time and experimentation were so incredibly expensive.

    We needed the approach that Gray outlines to make any sort of progress. The corruption of scientific publishing comes at a time when alternatives to the method are emerging.

    Although it’s certainly not true yet for all aspects of science, measurement is now embedded in many things and activities. In those cases, an alternative to the scientific method is required and is becoming formalized.

    In the world of “big data”, you look at the data first and then develop a thesis to explain it. Science was designed as a deductive process (generalization followed by testing) in a time when measurement was scarce. Inductive thinking (generalizing from the measurements up) is productive when things measure themselves.

    We’re just learning about the new methods; it’s the dawn of an era built on the fruits of scientific method but headed in a slightly different direction. Gray exposes an opportunity for the development of knowledge and wisdom outside of the gates of the traditional scientific empire. It’s another disintermediation story.

  • Chris

    Great presentation. Another way to free the facts is to use HTML instead of embedding text in images.

    But any additional pressure to stop journals charging is good.

  • I couldn’t agree more with “freeing the articles”, and said so in a recent blog post:

    That post was motivated when Nature made some of its catalog of journals available free.

    I can’t tell you how many times I’ve been researching something only to be vectored to Springer Verlag or some other extremely expensive source to pay for public domain research.

    I’m so annoyed with it, I just won’t join any of the societies that practice this kind of thing.



  • Tim, perfect follow on from your last post. I am a great believer in open access and have been a reader of PLOS ever since it started. I also support the requirement to ensure that the underlying data is publicly available too, rather than a request of the author.

    Most scientific journals are still in the information dark ages. A paper is submitted, a private review is done to determine whether it is publication worthy, it is read, and maybe a letter or two may be published attacking or correcting it. In the medical world, many of these papers turn out to be incorrectly analyzed (~30% in a recent study), and sometimes they may even be subsequently retracted. Exposure to a lot more eyeballs than the peer reviewers would be very helpful, and this fits in well with the idea of “harnessing collective intelligence”. Rating papers would be a huge help in making selections, rather than using citation indexes.

    Open access should also improve one other part of the process – citations. Many references in a paper may not be relevant, never read by the author[s] but included because they are believed important. Rapid access to these references would aid understanding of the paper and possibly reduce the number included if the ratings for each reference could be accessed too.

  • Access to research shouldn’t be our only goal. I know many people who are aware of the access problem but have stopped submitting to PLOS because they know their articles will end up in PubMed Central thanks to the NIH policies. These people lose out on remaining the copyright owners of their papers and on the Creative Commons license that allows their figures to be “remixed”. I can no longer easily use images from the articles to blog about them, since I now technically need to contact the publisher and request permission. I worry that the access goal line is too close and that many people are losing sight of the larger free-culture goal.

  • mlvlvr

    This is interesting, and it might have been even better with a soundtrack by Sheryl Crow. I have never used Flickr in this manner, and I will do so after having read this. But is the story as one-sided as that?

    Does the fact that something was created with public funds mean that it should be free? For instance, should pharmaceuticals be free just because public funds paid for some initial research? Or do the costs of publishing, and of adding editorial content and advocacy, as the AAAS publication SCIENCE does, have to be paid for additionally?

    Or, is the problem like that of Windows and OSX? Some argue that Windows has far more apps available for it than OSX, so OSX users are compromised. But as any attendee at an O’Reilly conference would attest, almost everything you could want is available for OSX, so Windows just offers more flavors but no more nutrients, so to speak. Why are there 24,000 publications? What gave rise to that?

  • Milver

    Interesting topic… I still think that journals do add value to the product by having a well-defined QC system, protocols, and accountability. We can eliminate the middleman, but we would still need someone to perform that task. I prefer to cite a journal and trust its QC system and reputation rather than trying to disprove every fact that comes to my attention. Maybe the taxpayer funding system for research should actually be extended to cover these areas and thus complete the cycle for the community.

  • John Sumser

    Not sure I agree that big data changes the scientific method. Chris Anderson raised this issue a few months back in Wired.

    But this assumes that patterns just jump out of the data without any hypotheses, and that’s just wrong.

    Consider what web 2.0 teaches us about big data: Google has lots of hypotheses about which algorithms will produce “better” search results. They test these algorithms, and measure the results. Hmmm…the scientific method.

    And even the “big data” we have is no bigger than the data we had earlier in the evolution of science, from our own high-bandwidth senses. It’s only relatively recently that we had to design experiments just to acquire data. Originally, we designed experiments not to acquire data, but to test hypotheses against data we already had, at least implicitly.

    In my own digging into big data sets, I’m always rooting around with a hypothesis in mind, trying to see if there is data that matches. Sometimes things jump out that I wasn’t expecting, but it’s usually in the course of a more directed search. As Pasteur said, “Fortune favors the prepared mind.”

    What is true is that we can gather a lot more data (and share it more widely) so that scientists can perhaps be exposed to some data that was generated without special experiments.
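    The test-and-measure loop described above (state a hypothesis that algorithm B is “better,” run both, compare the measurements) can be sketched as a simple permutation test. Everything here is a hypothetical illustration: the metric lists, the function name, and the significance threshold are mine, not anything Google actually uses.

```python
import random
import statistics

def ab_test(metric_a, metric_b, n_permutations=10000, seed=42):
    """Permutation test: is variant B's metric genuinely higher than A's,
    or could the observed difference be chance?  Returns the observed
    effect size and an approximate one-sided p-value."""
    rng = random.Random(seed)
    observed = statistics.mean(metric_b) - statistics.mean(metric_a)
    pooled = metric_a + metric_b  # new list; inputs are not mutated
    count = 0
    for _ in range(n_permutations):
        # Shuffle away the A/B labels and re-split: under the null
        # hypothesis ("no real difference"), any split is equally likely.
        rng.shuffle(pooled)
        a, b = pooled[:len(metric_a)], pooled[len(metric_a):]
        if statistics.mean(b) - statistics.mean(a) >= observed:
            count += 1
    return observed, count / n_permutations
```

    The point of the sketch is John’s: the surprise lives inside a framework of stated hypotheses and controlled measurement, which is the scientific method by another name.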

  • Thank you Tim and everyone for the kind words and thoughtful comments.

    I think when reading something like this it’s important to understand the author’s intent and so I’ll explain a bit of that.

    Tim, you also mention the use of an unusual medium (Flickr). In fact, Flickr was not only the presentation medium but the source of the whole idea, which did not simply spring forth from my mind but was (and still is) part of an ongoing, iterative creation process that parallels agile development in many ways.

    This little essay (for lack of a better word) was not really a planned thing but kind of emerged when I posted a simple napkin sketch to Flickr. It was a very rough sketch of ideas that were not fully formed but emerging from some reading I had been doing. You can see the sketch here.

    A few people commented and soon a vigorous discussion emerged — a debate about the nature of facts and belief. That comment thread sparked a lot more thoughts and conversations for me, both online and off — and a lot more reading.

    As I sat down to think about these issues I found that sketching them visually came more naturally than writing. This idea of representing a fact as a building block of uncertain stability came in a dream, for example.

    So I sketched the ideas and put them in sequence and put them online as a Flickr slideshow.

    Now each “slide” in the Flickr set can generate its own comment thread for those who are interested in that topic.

    Also, as commenters point out flaws or suggest improvements, or as I have more ideas, I can replace slides or add new slides like these to improve the story; that is, the story can evolve based on the ongoing discussion.

    I’d like to encourage anyone who has ideas that could improve this presentation to add their comments, thoughts and ideas to the Flickr photo pages — I hope this essay can continue to evolve and improve from criticism and discussion, just as any scientific theory does.

    To address a few comments above:

    John Sumser: I think these are interesting thoughts and encourage you to join some of that Flickr conversation. I imagine that access to large pools of data might be just as complicated and problematic as these other areas and would like to find a way to incorporate that.

    Chris: I didn’t “embed” the text in the images, I just didn’t spend extra time to “extract” the text and re-write it so the internet could read it. It just so happens that the internet isn’t smart enough to read my handwriting.

    mlvlvr: I’m not suggesting pharmaceuticals should be free, but the research. I’m not talking about privately funded research or proprietary information here, but research that has already been paid for and published to the scientific community. If you choose to take public money you can’t expect it to come with no constraints.

    Milver: I might be naive, but it seems like the Universities already have QC systems like peer-review for things like tenure, and that they could institute similar systems at a far lower cost than the fees they pay the journals.

    Thanks again Tim and all, for the kind words.


  • Regarding “But this assumes that patterns just jump out of the data without any hypotheses, and that’s just wrong.”: About 15 years ago I decided to see if I could make money by writing a program that provided nightly analysis of the previous day’s stock transactions, along with two years of daily history, for hundreds of NASDAQ stocks. I researched technical, fundamental, and other perspectives before developing an algorithm for this. I tested and finally deployed a program that sent me emails in the wee hours of the night with the top few dozen rated stocks based on my algorithm. I got a good return on my $5K experimental investment for a few months, until the free data source went away and I also had second thoughts regarding the ethics of day trading. The algorithm was not based on traditional technical or fundamental approaches that start with a streamlined hypothesis and then try to test it. Instead, it looked for the most predictable stocks and tested for profitability using a combination of how predictable a stock was and the financial penalty of being wrong.

    You could certainly make an argument that I had a hypothesis that an algorithm could find predictable stocks, but I think this is one level of abstraction above the usual assumption. By questioning the need for a hypothesis at the usual level, I’ve found nice rewards in other fields of interest. Take a look:
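    A rough sketch of the idea Kevin describes (rank stocks by how predictable they are, discounted by the penalty of being wrong) might look like this. To be clear, every function, scoring rule, and weight below is a simplified hypothetical reconstruction, not his actual algorithm.

```python
import statistics

def predictability_score(prices, horizon=1):
    """Backtest a naive next-day forecast (tomorrow ~= today) and turn the
    mean relative error into a 0..1 score: higher means more predictable."""
    errors = []
    for i in range(1, len(prices) - horizon):
        forecast = prices[i]
        actual = prices[i + horizon]
        errors.append(abs(actual - forecast) / prices[i])
    return 1.0 / (1.0 + statistics.mean(errors))

def rank_stocks(history, downside_weight=0.5):
    """Rank tickers by predictability minus a penalty for the worst
    single-day move (the cost of being badly wrong)."""
    scores = {}
    for ticker, prices in history.items():
        pred = predictability_score(prices)
        worst = max(abs(b - a) / a for a, b in zip(prices, prices[1:]))
        scores[ticker] = pred - downside_weight * worst
    return sorted(scores, key=scores.get, reverse=True)
```

    Note that no hypothesis about any individual stock appears anywhere; the only hypothesis is the meta-level one Kevin concedes, that predictability itself is exploitable.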

  • Roger Weeks

    It’s an interesting presentation, but like all text-heavy presentations, it’s nearly impossible to read.

    Why don’t people understand that small fonts, whether handwritten or typed, are completely useless for presentations? Realistically anything under about a 16 point font on a projector isn’t going to be readable by a large percentage of any given audience.

    The cuteness of the handwriting makes it somewhat more bearable, but it still fails the legibility test for me.

  • Hi Roger,

    I fully agree with you. If you’re looking at the embedded version above, the type seems quite small. However, if you click the link that Tim gives in his post (or click here) the images will play as large as your screen will allow.

    Unfortunately there are so many devices out there that it’s hard to make something that will work for every monitor. But if I run the slideshow at full screen on my little HP laptop, the text is easily 16 point or more.

    When I started this little project it was more of a personal exploration, a way to think through the issues involved. One of the reasons I posted them on Flickr is because Flickr allows people to view the images quite large and also print them if they so desire.

    I do think that this kind of hybrid, textual-visual form probably works better when encountered as printed pages or cards than it does when viewed online. So far Flickr seems to have the most flexibility of anything I’ve tried.

  • Linda

    First, thanks, Tim, for introducing me to Michael Wesch’s work. Better late than never.

    I’ll follow the author’s suggestion and post to Flickr my comment about his incredible statement, “Many scientists don’t realize that their research isn’t freely available to the public.”

  • Barry Rowlingson

    Wouldn’t it be great if we had some kind of building which kept copies of all these non-free articles, and for a small fee (plus some state support, because this would be a fine social enterprise of benefit to all) allowed people to have a look at the facts for a while, and then give them back.

    Now it would have to be a very large building, so instead let’s put the facts in lots of smaller buildings, or warehouses, and allow everyone to request the facts they want. For a small fee, perhaps, to handle the admin and shipping.

    Yes, it’s called a library. You can go to the British Library and request any published book or journal ever. And the only mention of prices I can find on the British Library web site is for souvenirs and trinkets at the gift shop.

    Not that free online access to journals wouldn’t be marvellous, but it’s disingenuous to imply that “the facts” are being kept away from people without cash or research contacts.

  • Andrew

    Peer review isn’t infallible though. Watch out for the ‘cello scrotum’ effect:

    The other issue is who pays for it. Do we pay for it out of general taxation? If so, which journals qualify: any peer-reviewed journal, or just ‘sciency’ ones? What about the humanities, philosophy, and other fields where ‘facts’ are more nebulous?

  • Barry: “Yes, its called a library. You can go to the British Library and request any published book or journal ever. “

    Yes, I remember libraries. I used them quite a lot in the pre-electronic days. Back in the 1970s, it took forever to work your way through citation indexes, locate the journal article, peruse it quickly, and photocopy it for further reading later. I well recall that at the time the science establishment was worrying about the ability of scientists even to determine whether their research was new, given how long it took to work through the journal stacks. Requesting an article copy is fine if you know exactly what you want, but the hard part, finding what you want in the first place, requires electronic access and usually more than the abstracts that are provided.

  • Kevin,

    Without knowing anything about your algorithm, it’s hard to assume anything but that it is an encoding of a hypothesis about what you expect to find meaningful in the data. No hypothesis, no algorithm, no result.

    Google is nothing but algorithms against massive data, but they create algorithms to produce results that people are looking for. There is surprise in individual results (just as there are surprises in scientific experiments) but it’s all framed by an underlying set of experimental goals.