Why files need to die

Files are an anachronism in the digital age. It's time for something better.

Files are an outdated concept. As we go about our daily lives, we don’t open up a file for each of our friends or create folders full of detailed records about our shopping trips. Create, watch, socialize, share, and plan: these are the new verbs of the Internet age, not open, save, close and trash.

Clinging to outdated concepts stifles innovation. Consider the QWERTY keyboard. It was designed 133 years ago to slow down typists who were causing typewriter hammers to jam. The last typewriter factory in the world closed last month, and yet even the shiny new iPad 2 still uses the same layout. Creative alternatives like Dvorak and more recently Swype still struggle to compete with this deeply ingrained idea of how a keyboard should look.

Today we use computers for everything from booking travel to editing snapshots, and we accumulate many thousands of files. As a result, we’ve become digital librarians, devising naming schemes and folder systems just to cope with the mountains of digital “stuff” in our lives.

The file folder metaphor makes no sense in today’s world. Gone are the smoky 1970s offices where secretaries bustled around fetching armfuls of paperwork for their bosses, archiving cardboard files in dusty cabinets. Our lives have gone digital and our data zips around the world in seconds as we buy goods online or chat with distant relatives.

A file is a snapshot of a moment in time. If I email you a document, I’m freezing it and making an identical copy. If either of us wants to change it, we have to keep our two separate versions in sync.

So it’s no wonder that as we try and force this dated way of thinking onto today’s digital landscape, we are virtually guaranteed the pains of lost data, version conflicts and failed uploads.

It’s time for a new way to store data – a new mental model that reflects the way we use computers today.

Flogging a dead horse

Microsoft, Apple and Linux have all failed to provide ways to work with our data in an intuitive way. Many new products have emerged to try and ease our pain, such as Dropbox and Infovark, but they’re limited by the tired model of files and folders.

The emergence of Web 2.0 offered new hope, with much brouhaha over folksonomies. The idea was to harness “people power” by getting us to tag pictures or websites with meaningful labels, removing the need for folders. But Flickr and Delicious, poster boys of the tagging revolution, have fallen from favor as the tools have stagnated and enthusiasm for tagging has dwindled.

Clearly, human knowledge is needed for computers to make sense of our data – but relying on human effort to digitize that knowledge by labeling files or entering data can only take us so far. Even Wikipedia has vast gaps in its coverage.

Instead, we need computers to interpret and organize data for us automatically. This means they’ll store not only our data, but also information about that data and what it means – metadata. We need them to really understand our digital information as something more than a set of text documents and binary streams. Only then will we be freed from our filing frustrations.

I am not a machine, don’t make me think like one

In all our efforts to interact with computers, we’re forced to think like a machine: What device should I access? What format is that file? What application should I launch to read it? But that’s not how the brain works. We form associations between related things, and that’s how we access our memories:

Associative recall in the brain

Wouldn’t it be nice if we could navigate digital data in this way? Isn’t it about time that computers learned to express the world in our terms, not theirs?

It might seem like a far-off dream, but it’s achievable. To do this, computers will need to know what our data relates to. They can learn this by capturing information automatically and using it to annotate our data at the point it is first stored — saving us from tedious data entry and filing later.

For example, camera manufacturers have realized that adding GPS to cameras provides valuable metadata for each photograph. Back at your PC, your geo-tagged images will be automatically grouped by time and location with zero effort.
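To make the idea concrete, here is a minimal sketch of how geo-tagged photos might be grouped by time and location. The photo records, the `group_photos` helper and the choice to round coordinates to two decimal places are all assumptions made for illustration; real photo software will use its own, more sophisticated clustering.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical photo records: (filename, capture time, latitude, longitude).
photos = [
    ("IMG_001.jpg", datetime(2011, 7, 2, 10, 15), 45.5231, -122.6765),
    ("IMG_002.jpg", datetime(2011, 7, 2, 10, 40), 45.5229, -122.6770),
    ("IMG_003.jpg", datetime(2011, 7, 3, 16, 5), 45.3736, -121.6960),
]

def group_photos(records, precision=2):
    """Group photos by calendar day and by coordinates rounded to `precision` places."""
    groups = defaultdict(list)
    for name, taken, lat, lon in records:
        # Photos taken on the same day at roughly the same spot share a key.
        key = (taken.date(), round(lat, precision), round(lon, precision))
        groups[key].append(name)
    return dict(groups)

albums = group_photos(photos)
for (day, lat, lon), names in sorted(albums.items()):
    print(day, lat, lon, names)
```

The point is that the grouping key comes entirely from metadata the camera recorded, so no manual filing is involved.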

Our digital lives are full of signals and sensors that can be similarly harnessed:

  • ReQall uses your calendar and to-do list activity to help deliver information at the right time.
  • RescueTime tracks the websites and programs you use to understand your working habits.
  • Lifelogging projects like MyLifeBits go further still, recording audio and video of your life to provide a permanent record.
  • A research project at Ryerson University demonstrates the idea of context-aware computing — combining live, local data and user information to deliver highly relevant, customized content.

Semantics: Teaching computers to understand human language

Metadata annotation via sensors and semantic annotation

As this diagram shows, hardware and software sensors can only tell half the story. Where computers stand to learn the most is by analyzing the meanings behind the 1s and 0s. Once computers understand our language, our documents and correspondence are no longer just isolated files. They become source material, full of facts and ready to be harvested.

This is the science of semantics — programs that can extract meaning from the written word.

Here’s some of what we can do with semantic technology today:

Today, most semantic research is done by enterprises that can afford to spend time and money on enterprise content management (ECM) and content analytics systems to make sense of their vast digital troves. But soon consumers will reap the benefits of semantic technology too, as these applications show:

  • While surfing the web, we can chat and interact around particular movies, books or activities using the browser plug-in GetGlue, which scans the text in the web pages you visit to identify recognized social objects.
  • We will soon have our own intelligent agents, the first of which is Siri, an iPhone app that can book movie tickets or make restaurant reservations without us having to fill in laborious online forms.

This ability for computers to understand our content is critical as we move toward file-less computing. A new era of information-based applications is beginning, but its success requires a world where information isn’t fragmented across different files.

Time for a new view of data

Let’s use your summer vacation as an example: All the digital information relating to your vacation is scattered across hundreds of files, emails and transactions, often locked into different applications, services and formats.

No matter how many fancy applications you have for “seamlessly syncing” all these files, any talk of interoperability is meaningless until you have a basic fabric for viewing and interacting with your data at a higher level.

If not files, then what? The answer is surprisingly simple.

What is the one thing all your data has in common?

Time.

Almost all data can be thought of as a stream, changing over time:

The streams of my digital life

Already we generate vast streams of data as we go about our lives: credit card purchases, web history, photographs, file edits. We never get to see them on screen like that though. Combining these streams into a single timeline, a personal life stream, brings everything together in a way that makes sense:

A personal life stream
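Combining streams in this way is easy to sketch in code. The example below is only an illustration: the three streams, their contents and the `life_stream` helper are hypothetical, and it assumes each source stream is already sorted by time.

```python
import heapq
from datetime import datetime

# Hypothetical per-source streams, each already sorted by timestamp.
purchases = [
    (datetime(2011, 7, 2, 9, 30), "purchase", "Coffee, $3.50"),
    (datetime(2011, 7, 2, 13, 0), "purchase", "Museum ticket, $12"),
]
photos = [
    (datetime(2011, 7, 2, 10, 15), "photo", "IMG_001.jpg"),
    (datetime(2011, 7, 2, 14, 20), "photo", "IMG_002.jpg"),
]
web_history = [
    (datetime(2011, 7, 2, 8, 45), "web", "Looked up museum opening hours"),
]

def life_stream(*streams):
    """Lazily merge several time-sorted streams into one chronological timeline."""
    return heapq.merge(*streams, key=lambda event: event[0])

timeline = list(life_stream(purchases, photos, web_history))
for when, kind, what in timeline:
    print(when.isoformat(), kind, what)
```

Because the merge is lazy, the same approach would in principle work for streams far too long to hold in memory at once.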

Asking the computer “Show me everything I was doing at 3 p.m. yesterday” or “Where are Thursday’s figures?” is something we can’t easily do today. Products such as AllOfMe are beginning to experiment in this space.

We can go further — time itself can be used to help associate things. For example: Since I can only be in one place at one time, everything that happens there and then must be related:

All data at the same time is related

The computer can easily help me access the most relevant information — it just needs to track back along the streams to the last time I was at a certain place or with a specific person:

Related data can be found by finding previous occurrences on each stream
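Tracking back along a stream can be sketched in a few lines. This is a rough illustration under stated assumptions: the timeline entries are invented, `last_occurrence` and `related_to` are hypothetical helpers, and the one-hour window for "the same time" is an arbitrary choice.

```python
from datetime import datetime, timedelta

# Hypothetical life stream, oldest first: (time, kind, detail).
timeline = [
    (datetime(2011, 6, 20, 12, 0), "place", "Cafe Roma"),
    (datetime(2011, 6, 20, 12, 10), "person", "Sally"),
    (datetime(2011, 6, 20, 12, 25), "document", "Q2 budget notes"),
    (datetime(2011, 7, 1, 9, 0), "place", "Office"),
    (datetime(2011, 7, 1, 9, 30), "document", "Project plan"),
]

def last_occurrence(events, kind, detail):
    """Walk backwards along the stream to the most recent matching event."""
    for event in reversed(events):
        if event[1] == kind and event[2] == detail:
            return event
    return None

def related_to(events, anchor, window=timedelta(hours=1)):
    """Return the other events that happened within `window` of the anchor event."""
    return [e for e in events if e is not anchor and abs(e[0] - anchor[0]) <= window]

visit = last_occurrence(timeline, "place", "Cafe Roma")
nearby = related_to(timeline, visit) if visit else []
for when, kind, detail in nearby:
    print(kind, detail)
```

Here, asking about the last visit to the cafe surfaces the person and the document from that lunchtime, without either having been filed anywhere.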

The world — our lives — is interconnected, and data needs to be the same.

This timeline-based view of data is useful, but it becomes even more powerful when combined with the annotations and semantic metadata gathered earlier. With this much cross-linking between data, our information can now be associated with everything it relates to, automatically.

Finally, we can do away with files because we have a system that works like the brain does. It gives us another new power, the ability to traverse effortlessly from one related concept or entity to another until we reach the desired information:

Associative data navigation

In a system like this we navigate based on what the data means to us – not which file it is located in.
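One way to picture this kind of navigation is as a walk over a graph of associations. The sketch below assumes a hypothetical `links` table and uses a plain breadth-first search to find the shortest chain from one entity to another; it is an illustration of the idea, not a description of any existing product.

```python
from collections import deque

# Hypothetical association graph: each entity links to the things it relates to.
links = {
    "Summer vacation": ["Flight booking", "Beach photos", "Sally"],
    "Flight booking": ["Summer vacation", "Credit card statement"],
    "Beach photos": ["Summer vacation", "Sally"],
    "Sally": ["Summer vacation", "Beach photos"],
    "Credit card statement": ["Flight booking"],
}

def find_path(graph, start, goal):
    """Breadth-first search: the shortest chain of associations from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in graph.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None  # No chain of associations connects the two.

path = find_path(links, "Sally", "Credit card statement")
print(" -> ".join(path))
```

Nothing in the lookup refers to a filename or folder; the route is defined purely by how the items relate to one another.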

There will be technical challenges in maintaining data that resides on different devices and is held by different service providers, but cloud computing industry giants like Amazon and Google have already solved much more difficult problems.

A world without files

In the world of linked data and semantically indexed information, saving or losing data is not something we’ll have to worry about. The stream is saved. Think about it: You’d never have to organize your emails or project plans because everything would be there, as connected as the thoughts in your head. Collaborating and sharing would simply mean giving other people access to read from or contribute to part of your stream.

We already see a glimpse of this world when we look at Facebook. It’s no wonder that it’s so successful; it lets us deal with people, events, messages and photos (the real fabric of our everyday lives), not artificial constructs like files, folders and programs.

Files are a relic of a bygone age. Often, we hang onto ideas long past their due date because it’s what we’ve always done. But if we’re willing to let go of the past, a fascinating world of true human-computer interaction and easy-to-find information awaits.

Moving beyond files to associative and stream-based models will have profound implications. Data will be traceable, creators will be able to retain control of their works, and copies will know they are copies. Piracy and copyright debates will be turned on their heads, as the focus shifts from copying to the real question of who can access what. Data traceability could also help counter the spread of viral rumors and inaccurate news reports.

Issues like anonymity, data security and personal privacy will require a radical rethink. But wouldn’t it be empowering to control your own information and who can access it? There’s no reason why big corporations should have control of our data. With the right general-purpose operating system that makes hosting a piece of data, recording its metadata and managing access to it as easy as sharing a photo on Facebook, we will all be empowered to embrace our digital futures like never before.

Photo: Filing Cabinet by Robin Kearney, on Flickr

  • Paulo

    Well, it seems from a technical point of view, files are as valid a metaphor as ever for describing how the computer should manage a piece of atomic information. Only after reading through your article it becomes clear that files need to die as a human-computer interaction metaphor, and not as a technical abstraction for the OS to manipulate an atomic piece of data. Very good article, though the title is a bit too catchy for its own sake.

  • http://derekarnold.net Derek Arnold

    This article sounds really silly if instead of saying ‘files’ you say ‘discrete units of data’, because that’s all that files are. What a troll.

  • jwoodruff

    You have two errors in your lede-in. The QWERTY keyboard was not designed to ‘slow down typists,’ the physical layout of the keys – the off-set rows, rather than perfectly aligned rows – is what prevented letters from jamming. See this well-cited wikipedia article: http://en.wikipedia.org/wiki/QWERTY

    Additionally, according to NPR, the last typewriter factory in the world did not close. It’s possible the last MANUAL typewriter factory in the world closed, but companies like Brother are still selling typewriters.

    Make sure you have your facts straight before you make broad generalizations next time. Or, better, just don’t make broad generalizations at all.

  • Luis Rojas

    This is a very interesting take on the metaphor to handle discrete data. I think you miss a couple of things, like how should I store a letter I sent to my friend or how do I handle my music. The bits still need to be stored somewhere in some form or another and that’s what a file is… just a way to group the bits together. Now, the filesystem, the way we organize these groupings of discrete data, I agree that needs to change.

    Good article overall

  • http://alexbowyer.com/ Alex Bowyer

    Thanks for the comments.

    Paulo: You’re right that the main concern is how the information is represented to the user.. but I actually think there is a case for replacing files at the technical level too. Microsoft tried to do this back in 2003, for what was then called Longhorn (later Windows Vista), when they created WinFS. It was a filesystem based on a relational database instead of discrete files, to solve exactly this problem:
    http://en.wikipedia.org/wiki/WinFS
    There are other examples of attempts to design systems which don’t deal in files. Apple’s iCloud may be the latest example, although I am not 100% sure how that works under the covers.

    jwoodruff: There are certainly question marks over the exact history of QWERTY’s design, but I did research it. Here are a few references:
    http://bit.ly/r3r4rH , http://bit.ly/qJiNh1 , http://bit.ly/nSs447 , http://bit.ly/oY2euv
    I wouldn’t get too hung up on the exact details. It’s simply a side point to illustrate how dated technology ideas make it hard for new, more efficient solutions, to take hold.
    I wasn’t familiar with the NPR findings about some typewriters still being produced (reference here: http://n.pr/q9UCia ) so thanks for sharing that. Again, it’s really not the main point of this article – typewriters *are* an old technology and they’re on the way out, that’s all that matters for the purpose of the example.

  • http://alexbowyer.com/ Alex Bowyer

    Luis, Derek,

    Yes, bits will need to be grouped together, and you’re right that it’s more the filesystem that’s the issue than the files themselves..

    What I’m really saying is that maybe data shouldn’t be stored discretely in isolation – at least not without metadata or links to related content.

  • Joel

    I would agree that files are archaic and unintuitive with regard to human use and management. We have to manage our own directory structures, file names, and so on. However, even if everything were an ‘object’ with scalable metadata to contain all of the information we needed to recall a discrete piece of data from a certain point in time or related to something we needed to find, how would it be stored by the computer?

    It would still be a file. A bloated one, that is. Or a series of bloated files. With many ancillary files used for nothing more than storing relationships, patterns, and other things that tie one discrete piece of data to another.

    Perhaps what you’re looking for is a better system of managing information as a user, not reinventing how computers store data.

  • Michael Peters

    I agree, and while we’re at it let’s get rid of bits and bytes too. Nobody uses them, we all use tweets, posts, hashtags and links.

    Come on, that’s just silly. If the current abstractions are too confusing for people then I’ll admit maybe it’s time to put other abstractions on top of those, but files and directories are fundamental building blocks for almost every computer program in existence even if in some of those cases (like most web software) that abstraction is hidden from the users.

  • http://alexbowyer.com Alex Bowyer

    Bits and bytes are fundamental of course.

    But “files are fundamental building blocks for every computer program in existence” sounds a lot like “this is how we’ve always done it, so we should continue to do it that way” – which is kind of my point. It’s become deeply ingrained, when it needn’t be. New filesystems are produced all the time, and they need not be file-based (see WinFS example in comments above).

    I’m not saying there is no place for files at all – I am saying that we shouldn’t cling so tightly to the filesystem idea, we should be open to new models.

  • Stephen

    You make a solid argument. Being able to find and navigate information in something like your vacation network graph appeals to me a lot.

    (Personally, I’d hate having to wade through a timestream but can see how it would be useful in emergencies.)

    However, you really missed the boat on tagging. Tags haven’t gone out of favor; they’re just called different things. For example, one of Gmail’s best features is Labels (just tagging by a different name), which improves upon folders because a single item can exist in multiple categories.

    In fact, your network idea could be implemented by tagging. But, manual tagging is work. What we need is a system that recognizes context and intent and automatically tags accordingly. You tell the system you are working on Project X and all documents and events for that unfold before you. Any work you do is tagged with the Project X context along with other implied contexts (conversations with Sally, project tasks completed, fonts used etc.)

    We might want to revive an old idea from the 90’s: Active documents. Both Microsoft and Apple promoted the idea that documents (and document fragments) would be linked so that, say, a change to a spreadsheet over there would automatically be reflected in an updated pie chart in this document over here. You can do that still but the idea never took off because computing was too disconnected and file-based at the time.

    What we need is an OS that automatically ties together activities and documents and lets us navigate those relationships rather than folders.

  • http://radar.oreilly.com/edd Edd Dumbill

    Nice post, Alex.

    @Michael Peters, files are predominant primarily because of the accident of commercial dominance of certain operating systems. If, for instance, Oberon or Smalltalk systems had been successfully commercialised, we wouldn’t have had the whole distraction of files and folders that we do today.

    To Alex, I’d like to see you explore the broader story here, which is the continuing struggle of us as humans to find means of expressing ourselves through technology. I don’t think Facebook is any more suited than files and folders are, if I’m honest. Though streams seem a convenient metaphor, they too are not the way that humans really deal with information.

    I think you identify some key requirements. Associativity is certainly a key mechanism that we need. We also strongly need a workable social backbone, in order to effectively apply notions of provenance and community.

    But the key omission from this article, and to be honest a point I’m not entirely clear on myself, is what is the goal here? Is it simply self-expression, or is it something else? To evaluate technological solutions, we have to understand what the requirements are.

    You can’t assert that data needs to be interconnected like our lives, unless you state why we need the data systems in the first place. The most immediately plausible answer is that this is simply about the human need for self-expression, and we’re seeking technological means of accelerating and broadening what we already do naturally.

  • Alex Tolley

    It’s true, live long enough and watch as everything gets “reinvented”. David Gelernter proposed the time stream metaphor back at least in the early 1990’s. I wonder why it didn’t gain traction?

    Contrary to your assertions, the automated meta data/tagging and time stamping of information won’t reduce your cognitive load as much as you hope, it just changes it. The easiest way to understand this is to assume all your data is in a black box device with some sort of Google search method to locate it. Now you need to know the content to start the association links, and you will need to filter out the extraneous data to locate the few items that you need. You may also lose files because the associations are either incorrect or “forgotten”. Just so you could avoid a tidy filesystem. The irony is that the image you use to illustrate the idea is already pre-categorized…like a file system.

    There is a false logic in “Facebook is successful”, therefore we should follow their social model for work (and everything else). You need to think this through a lot more carefully.

    There are certainly major failings with filesystems, both physical and digital, but assuming that some system that mimics the way your brain handles the world will solve it, is just magical thinking.

  • http://alexbowyer.com/ Alex Bowyer

    Hi Edd, thanks for a very deep response!

    You are right that there is a really high level question here – What are computers for?

    Self-expression is important, but in the main I take the view that computers are here to be used as tools to make our lives easier – and as such, the computer should, over time, get better and better at serving me and require less and less instruction.

    We need to make the paradigm shift from isolated data pieces to linked data in order to really enable that kind of learning and augmentation of data to take place (because until then, where would the computer store the additional understanding).

    You could say it’s a bit of a HCI purist point of view – but I genuinely believe we can build computers that are a lot better at understanding us and servicing our needs than we have today.

  • http://alexbowyer.com/ Alex Bowyer

    @Alex Tolley:

    True, with every new technology we have to deal with it in new ways, and we’d probably always have to spend some time to get things right.

    I suspect we would spend less time tagging, and more time “training” our computers (Yes that is correct, you got that wrong, Never ever do X if Y happens, etc).
    In some ways it would be like teaching a child.

    I understand the point about magical thinking – clearly there is a vast amount of technology required to enable something like this – but for now I’m focussing on the vision. I am sure we could have many more posts exploring the technical ways to achieve something like this and the technical limitations to overcome.

  • Dave Horrigan

    Great article. However it could be more simple. The mind stores data by time, subject and importance. BFF is a good example. Subject friend, importance best and forever(noe) time. Test this on all memories. Ask a question and the mind will go subject, importance time in a sequence scan for the answer. The memories will arrive most effectively if called up asking those three qualifiers.

  • http://www.icemark.com Chris Wild

    Interesting article Alex, but one that I feel may be flawed by the perception of how people make associations and how they need to access their data. I am sure that much of what you said is true for a large proportion of people, but it is equally not true for a large proportion.

    I do not recognise your association diagram for remembering something. My mind just does not build associations like that, nor do I traverse them like that. I remember things because I remember them. I remember a phone number by remembering the number, not by associating it with anything other than the owner. I remember where I was on Tuesday by where I was on Tuesday, not by remembering something I had to eat… etc… My wife and children however heavily use association. They make rhymes etc. Mental pictures. Yes I am sure that subconsciously my mind makes associations, but then these tend to be used subconsciously. Remembering a song based on a smell, a time based on a taste, etc..

    I absolutely do not recognise your time stream. Time is a concept that I do not use for retrieval of information, other than as an additional meta data construct.

    Now, files are nothing more than containers for data. And I need to place all containers in a place, whether they are computer files or hard real world objects. Books go on my shelf. Letters go in the filing cabinet. The cabinet and shelf are containers for my data.

    Software files are used the same, they are placed in the known constructs of documents, projects, pictures, etc.. yes they may be sub-filed (as in a real-world filing cabinet), but ultimately I know that my letter to the gas board will be in /documents/letters/utilities/gas board, that my short story will be in documents/writing/shorts, and that my pictures from my cousin’s wedding will be in /pictures/katie/wedding.

    I use Scrivener in my writing, and this has the concept of a file as a container, not just for your text, but for anything related to the project. This allows you to place any information that you need within the same container, without it affecting the output of the document. This is nothing more than having a filing cabinet or a folder for the specific project and all its documents. I can find everything because I know where it is.

    If I had to start searching for my data based on associations life would be far more complicated. I am completely happy to use meta data associations to enable me to find documents that might not be directly related to the file at hand. But this is about referencing information, not storing or finding a specific item. And is only relevant when you want to link information, as in your example of “give me everything related to my holiday”.

    I think the ui for filing systems may change over time, but the simple concept of a file as a container will remain, because ultimately we have been storing things in containers for thousands of years, and ultimately there is a reason for that… it works.

  • http://itheresies.blogspot.com/ David Mohring (NZHeretic)

    You need file/directory and/or consistent URI systems because every attempt at providing a solely search based interface completely fails to scale to hundreds of documents and more than tens of people.

    You need to be able to make copies of documents and websites/services for reference and to modify things in parallel.

    You need consistent service-reference identifiers to pass references to documents and information between people.

    You need either a well managed hierarchical directory or shared managed ontological tagging scheme to manage any more than a hundred or so documents.

    I have seen the absolute mire that even small businesses can get into who rely solely on Google Desktop search or Microsoft search to access their documents/emails.

    What is really needed is the universal adoption of a standardized, multi-vendor, consistent URL (Uniform Resource Locator) by all operating systems, local applications and remote services.
    To be truly productive you need to reference (link), copy, cut, view and include content constantly across all applications, vendors and services. That will require the result of all accessed data to be renderable into a document format and stored as a file.

  • http://softwaregreenhouses.com Marty Nelson

    I think time is a terrible organizing principle for humans, especially for our data.

    We organize nouns (people and things) based on our priorities and emotional connection or response.

  • http://twitter.com/vlb Vicki

    > As we go about our daily lives, we don’t open up a file for each of our friends or create folders full of detailed records about our shopping trips.

    Maybe _you_ don’t. But I do. Call them what you will, I find “files” to be a very handy data unit. I agree with Chris Wild in the comments: Files are nothing more than containers for data.

    I like my containers.

  • Me Again

    All we need is a few clever applications. The average person is capable of storing their pictures (files) in folders (names to group their pictures), and same for documents. Few of us are inclined to remember when or where a document was created, and less inclined to add lots of commentary to expand the search criteria or to help some sort of AI program locate your long lost / useful files.

    It’s no use telling us readers, tell Microsoft, Adobe and others, or develop it if you actually have a solution.

    Otherwise it’s a lot of complaining about something that really doesn’t need a lot of attention.

  • BIll OConnor

    There is a great article in NYT magazine about memory competitions (http://www.nytimes.com/2011/03/06/magazine/06letters-t-THEMEMORYCHA_LETTERS.html). The file metaphor works because making mental maps based on location works quite well across a lot of domains. Although all the semantic stuff sounds cool and reasonable, remembering relations between things is more difficult than locations and language is just too imprecise to work reliably to find stuff. Of course if you use the rather simplistic example it sounds nice but it doesn’t scale. That is not to say that metaphors other than file/directories could not work better.

  • http://gnosis.cx/publish David Mertz

    The gist of the article seems very misguided to me. A timeline as a structuring paradigm isn’t terrible if all data were personal vanity journals, but it just isn’t. And I do know that files have, y’know, modification date and creation date as part of the file system data, so it’s not like a filesystem fails to already provide that structure if that’s what one is interested in.

    However, only a small part of the way I mentally structure the data/content that interests me has anything to do with a timeline. In some trivial sense, yes, I first encountered, or learned, or thought about, some concept at some point in time, but only rarely is that temporality the key aspect. For example, here are two things (with my funny background and interests) I can imagine myself wanting to find (on my local computer/filesystem and/or on the internet):

    * When exactly did Python introduce generators? What PEP was it that discussed it, and what were the motivations at the time? Who was it that proposed them, anyway? I bet I wrote an article for my Charming Python series at the time; what examples did I give back then? And which article was it?

    * I like that old quote from Marx about “Hegel said history repeats itself, but the first time is as tragedy, the second time as farce.” What was the exact wording of it though? And what book did it come from? Didn’t I once mention it in an article I wrote? What do commentators make of the comment?

    In these cases the “when” (and also the “who”, “where”, etc. that Bowyer proposes as inherent to my mental organization) are exactly the questions I am trying to remember/figure out. It’s not that I necessarily think of these questions in terms of files, but still less do I think of them in terms of timelines.

    In the end, although I’ll probably use full-text search to find the relevant documents, it really is FILES that I want to find when I formulate these questions. I want that charming_python_NN.txt for my article. I want the PEP-XYZ.html that is somewhere at python.org. I want the online version of the Marx book (or maybe I want the very file-like physical book from my shelf). The idea that conceptually related information is closely grouped together by stitch binding, or by manila envelopes, or by a FILESYSTEM, is something that is very old, very intuitive, and reflects millennia of human wisdom. Let’s not give it all up to make all knowledge more closely resemble Facebook or Twitter feeds!

  • Craig

    Doesn’t iOS do this very thing?

  • http://danzen.com Dan Zen

    It is good to read the various thoughts – here is some philosophical context I have been working out over the last few years in many sketchbooks.

    In our information age, millions of logical people use XML to hold data in a hierarchy.

    We use OOP to model life and it is based on a hierarchical system.

    Our folders and files – a hierarchy, categorization / classification a hierarchy.

    Nesting, tabbing, indenting, brackets – all ways to show a hierarchy.

    We call the “containers” in the hierarchy, nodes.

    Everything above a node is its context. Everything in a node is its content.

    Any property of an object (tag) can be brought from its content up into its context.

    This allows an object to be placed in many contexts or branches.

    I, Dan Zen, am categorized under Dans, under males, under inventors, under male inventors.

    Any apparent network can be represented by a hierarchy (hence Internet based on servers with files).

    All hierarchies, the single hierarchy makes.

    We are all directly categorized under node 0 in the single hierarchy.

    Hierarchies are not as rigid as we are led to believe because of infinite arrangements.

    There are two forms of hierarchy which often make things confusing:

    1. classification with inheritance (is a)

    2. composition with members or parts (has a)

    Classification really does not exist except perhaps in the neurons of our brain.

    So really, composition is real life and in real life:

    Context is made up of ands (combinations) – it is in the past (this and that happened)

    Content is made up of ors (permutations) – it is in the future (this or that might happen)

    Life is converting potential energy (content) into kinetic energy (context).

    The node is now.

    Definitions exist in probabilities like quantum states in atoms.

    Paradigm – para meaning next to – would be an embodiment (sibling).

    Metadigm – meta meaning above in a hierarchy is the parent.

    Ideas are defined in patents in hierarchical form with embodiments and subclaims.

    Mindmapping is hierarchical.

    Trees are hierarchies reaching out for energy up and down from the seed.

    Any sentence is just a branch of the single hierarchy.

    A medium sits between – hot, medium, cold. The living, the dead, the idea and the creation.

    A medium is just a node. For example, a word is a medium – we can build with it.

    - NODISM –

    If you would like some visuals to go along with this, please check out http://nodism.org

    Follow http://danzen.com/+ http://danzen.com/- http://danzen.com/~ http://danzen.com/^

  • http://alexbowyer.com Alex Bowyer

    Thank you all for the feedback, it’s a good way to develop my own ideas… Having thought it over, I think it is evident that in my attempt to keep this post simple and high-level, I have glossed over a very important detail… what *exactly* am I proposing happens to files? That they would go away completely? In short, my considered response is no… Let me explain.

    As some of you have said, files will always have a place – at least in the sense of “a discrete bundle of data that represents something” or “a container”. But we *do* need to move beyond the idea of files, because they are meaningless…

    So what I think I should have said – what we really need to get away from – is the idea that a file can be allowed to be a bundle of data *about which the filesystem has no understanding*.

    So if you like, a more accurate title for this post would have been
    “Why general-purpose, not-understood-by-the-system files need to die”. Not as catchy though is it? ;-)

    The real problem, though, is that our operating systems allow data to exist in isolation, its only structure coming from the arbitrary folder locations that we may happen to assign.

    Imagine if every file on your disk could not be stored without some basic metadata about what project it relates to, or which vacation, or person. In semantic web terms, what if every file existed as a specific representation of an RDF resource (cf http://bit.ly/pYW3TM) – so the system would know what it was.
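
    As a rough sketch of what such system-held metadata could look like, the snippet below stores subject/predicate/object triples about files and answers simple pattern queries. It uses plain Python tuples rather than a real RDF library, and the `ex:` predicates, resource names and file paths are all invented for illustration.

```python
# Hypothetical metadata triples: (subject, predicate, object).
# All URIs and predicate names here are made up for the example.
TRIPLES = {
    ("file:///photos/IMG_0001.jpg", "ex:takenDuring", "ex:vacation/rome-2011"),
    ("file:///photos/IMG_0001.jpg", "ex:depicts", "ex:person/alice"),
    ("file:///docs/itinerary.pdf", "ex:relatesTo", "ex:vacation/rome-2011"),
    ("file:///docs/budget.xls", "ex:relatesTo", "ex:project/kitchen"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


def files_about(triples, resource):
    """List every file linked to a resource, whatever the predicate."""
    return sorted({s for s, _, o in match(triples, obj=resource)})
```

    With data like this, `files_about(TRIPLES, "ex:vacation/rome-2011")` would return both the photo and the itinerary, regardless of which folders they happen to sit in.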

    All data has a meaning. It should not be possible to store it without that meaning. So yes, the main focus of this post is that the user interface paradigm, the way data is presented to us, would be more useful if it was something semantically meaningful to us. But my general point is greater than that – that there is a technical limitation caused by non-specific files – they encourage data to be unlabelled, disorganized, and disconnected.

    But rather than just force us to label our files better – the system can do it for us. And this is where all the sensors and semantic annotations come in.

    As for timelines – sure, they may not fit every conceivable user scenario. But at the technical level, time is the base unit by which things can always be linked, even when no other sensors are available.

    Right now files have timestamps, but filesystems are not organized to let you easily find all files touched at the same time, or applications/websites used in conjunction. Just this one simple annotation, combined with an interface to navigate between coincident files (and it need not be represented as a timeline), would be a huge leap forward in information retrieval.
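
    To make the idea of "coincident files" concrete, here is a minimal sketch that groups files by how close together their modification times are. The function name, the five-minute default window and the use of modification time alone are assumptions made for the example, not part of any existing filesystem feature.

```python
from pathlib import Path


def coincident_groups(root, window_seconds=300):
    """Group files under root whose modification times fall close together.

    A new group starts whenever the gap to the previous file (in time
    order) exceeds window_seconds. Only the modification time is used.
    """
    stamped = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            stamped.append((path.stat().st_mtime, path))
    stamped.sort(key=lambda pair: pair[0])

    groups = []
    previous = None
    for mtime, path in stamped:
        if previous is None or mtime - previous > window_seconds:
            groups.append([])  # gap too large: start a new group
        groups[-1].append(path)
        previous = mtime
    return groups
```

    Each returned group is a list of paths that were touched in the same burst of activity, which an interface could then present as "things you were working on together".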

    All the other annotations I mentioned would make things even easier to find. And the last 10-20% will always have to come from human intervention, but hopefully, if that is in the form of training rather than manual correction every time, we can avoid it becoming tedious, and the percentage will go down over time.

    I hope this follow-up clarifies what I was trying to convey. Yes, data needs to exist in some bundled entity – but that bundled entity should not be allowed to exist in an unstructured, meaningless format devoid of annotations.

    Linked Data, that’s what we need. Everywhere.

  • Waveney

    Interesting but flawed in that the concept of ‘timeline’ presentation sans file-structure, could only present incidents and events-without-content for each individual person; the information has to be stored somehow in a formal reference structure.
    Unless of course we are ‘all’ plugged in, wired up and on line 24/7/365… and that to me is a dystopian nightmare.
    Matrix reloaded anyone?

  • Vince de Vries

    I have given this topic a lot of thought and I have even tried to build a demo for an alternative way of storing data and retrieving it (but did not succeed, too complex for me and I found no-one to help me). I agree, the concept of files is obsolete in many ways (even if the whole world is still clinging on to them). But not in all ways.

    I think of files as a way of representing a view of data. In some cases that is quite simple. For instance a (digital, like e-mail) loveletter is only meaningful by its words and the way they’re organised. It is very easy to look at such a ‘document’ as a file. It does not have to be stored as a file, but when I open it, it should look exactly the same as it looked earlier when it was written. Here it may be hard to distinguish a file from the data.

    But a railway timetable is quite silly to look at as a file. It is a dynamic set of structured data and in its presentation you would never desire to see all of it (like the love letter), but only the part you need at a certain point in your life.

    Having said that, I would like to say that I do not believe that data changes. The train with the direction Y leaves station X at 10:34. That is a fact and it remains true, even if today that train no longer runs. It was a valid fact at a certain point in time and has now been succeeded by other facts. A loveletter still reflects love, even if the author completely lost that love and wishes you dead after you cheated on him/her.

    So time IS an important factor in the meaning of things and facts. But not the only one. When I am at home and relax and feel like eating chocolate cake, time is almost meaningless as a guide to where I may find that. I could try to recall when I bought the cake, what I did with it and whether somewhere in time I may already have consumed it, but that’s rather tedious, since I can easily go to the cupboard and see if there’s still some left; that’s what most people do anyway. So if the chocolate cake is a file kind of thing, I’d rather retrieve it in a spatially or functionally defined environment than in a temporal one.

    But that is – I think – only a detail. What it comes down to is that if a piece of information is well organised, it can be retrieved in many ways. And after it has been retrieved it can be represented in a way that has the look and feel of a file (i.e. a letter, spreadsheet or powerpoint). The presentation layer needs to be separated from the data so that its contents can be organised in a meaningful way.

    A computer uses files as discrete pieces of information, tagged with some metadata. But usually a ‘file’ consists of much more information than can be tagged. If I make a shopping list, I could choose to use Notepad. Which is not very helpful other than helping me remember what I need. A contemporary shopping list looks exactly the same, but under the hood every item is recognized as a symbol for something in the real world. That can be done quite easily by just adding an invisible UAN to every item so the information about that item is readily available.

    The shopping list as a file is a stupid piece of bits and bytes, the same list stored in an intelligent way can make your life a lot easier (you folks may dream about the hundreds of ‘apps’ that you could build to handle such a list). This is not the future, it’s already happening.

    But there is a problem. That is semantics. The shopping list works best if there is a database holding all the information about items for sale in a logical, structured way. The old-fashioned way. Semantic organisation of articles in a grocery store is a disaster and that disaster is also already happening. Finding or retrieving information that has been organised semantically is driving people crazy. I live my own life and the meaning that other people give to stuff is hardly ever of importance to me. Think about it. How many times have you seen something like “the funniest video ever”, watched it, didn’t like it. The one who thought it to be so funny may already have changed his mind but the word is still out and when I try to find the funniest video ever, I simply find videos that are not funny at all, at least not in my perspective.

    Or this: Wikipedia has a vast amount of information, stored in the dumbest form: plain text (ya yah, there are some interlinking mechanisms). So if I wonder what the hottest places (highest temperature) are in August, I simply can’t find them. If I asked Google for the hottest places in August, I’d get a zillion hits that are not at all what I’m looking for. Semantics is related to meaning and meaning is a very personal thing.

    Organisation of data needs an abstraction that goes beyond personal meaning. Especially because we are social animals and need to not only retrieve our own data, but mostly others’. And the others have different ways of naming things or they give different meanings and levels of importance to them.

    We need to get rid of files as the technical metaphor of bundled data and instead build databases. The data can later be retrieved and presented in a file kind of way, for the people who like to look at them like that. But we need to organise data in an abstract meaningful way. That is: for every item, idea, letter, tree, book or tv-show, we need a database entry that describes that item and its relation to other items. We need to distinguish ‘a tree’ from ‘my tree’ (where ‘my tree’ is an element of the set called ‘all trees’). Not by semantically tagging them or finding relations that come from random interactions that happen in the same time or space, but by pointing out in a very technical way what makes an item exactly that item. That is about dimensions, colors, hierarchical relations, functional relations. That is about defining what we mean with the descriptions we use. That is about using symbols for things and relating the symbols to the words we use.

    When that foundation has been built, every person and application on this earth will at least be able to get the facts right and build their own layer of idiosyncrasies on top of that. And yes, that may involve semantical or temporal relations. But under that, we still have the facts to which these relations relate.

    That may look like a tremendous effort, but if Wikipedia had been written with better software, we’d already be halfway. So it is achievable and it is what I believe is the future. And indeed, it is only achievable when we note that the concept ‘files’ suffers from a terminal disease called obsoletosity. Files may be part of the presentation of data; they should not be the way data is stored or handled. With only three exceptions: databases are files too, and so is the software that we need to handle them, and so are binary documents. It may be technically necessary to keep them in a file structure so that the OS can handle them, but that is the only future I see for files.

  • Sameer Verma

    Take a look at Sugar’s Journal concept. No files or folders. The datastore is an activity-based timeline of events, which can be “paused” and picked up from where you left off.

    http://wiki.laptop.org/go/Journal

  • Dave Zaffery

    Alex, while the idea of replacing the “file” metaphor with something better may sound appealing, I would say that you are not going to succeed. You fail to realize that using files “is” the natural way that most people organize information; how many still use physical file folders to keep their bills and other important documents straight?
    Just because an idea is “old” doesn’t mean it needs replacing; in fact I would say that its very simplicity is why it has been so successful!! The most fundamental concept that made UNIX operating systems successful was that everything was a “file”: devices, memory, binary data, etc. That common metaphor made it really easy to write programs that could inter-operate with each other. As I mentioned earlier, file and folder metaphors are natural ways to store things. The biggest change, and I believe the only change, that is needed is the ability to better search and index all of those files, folders and data. That’s why Google is so successful.
    If you have a better idea than the File/Folder metaphor, then write it up for discussion. What I saw in your post wasn’t anything other than your suggestion that it’s an old idea that, in your opinion, needs replacing…but nothing to substantiate your reasoning. File metaphors work because they are simple, easily understood concepts that your Grandmother and 3-year old kids understand. So keeping things simple still works.

  • Huggermugger99

    I don’t think the author is quite sure what he is advertising here.

    On one hand he claims it is about representation and the user perspective, on the other hand he hints at what he believes might be a technical solution, namely WinFS. Not having done any research but trusting in his word that it is a relational database, last time I checked, these are based on files as well.

    And to give his inventive cooking in the IT kitchen the certain something, he adds Semantics for seasoning because next to the cloud and social networks that is the popular cuisine nowadays.

    So relational DBs and semantic tools, hmm. Hold on: the systems I have come across don’t really want relational databases to do their reasoning, and for good reason actually. Also, they all suffer the obvious shortcoming of only knowing and inferring what they have been “taught” to do.

    Damn. That’s lame, almost as lame as plain old files. So are we, as users, now to define the relationships for this to work? If a semantic system comes across something it cannot relate, does Windows come up with the clip and ask the user what he meant and store it then…

    The author tries to be a philosopher but the old Greeks were actually all quite practical and down to earth despite their musings. And they always aimed at proving their concepts.

    Anybody can pick a random bit out of conventional computing and operating systems and propose it is deprecated, and of course anybody should even be encouraged to do so (for the sake of progress), BUT I would expect them to then suggest alternatives that are not just swell-sounding words that as a whole make little sense and probably wouldn’t live up to the simplest task that old-fashioned files accomplish in an easy and fast way.

  • http://www.alexbowyer.com/ Alex Bowyer

    A couple of the comments criticized the post for not offering a cohesive or absolute solution to the problem I described. To be clear, that is not what I set out to do in this post.

    The aim was to present a vision of how different the digital landscape might be if all data was linked together and semantically understood by computers. And to give an overview of current trajectories that are trying to solve this problem along with a few ideas that might help us to move in the right direction. There is no silver bullet, yet.

    A technical outline of a specific solution is suitable fodder for a future post by myself or others but not within the scope of this post. (Feel free to add suggestions though!)

    One of my other aims was to start the conversation – which seems to be working! Keep the comments coming.

  • http://www.whatsthebeef.org whatsthebeef

    I think the point of this post has been missed by most of the commenters. Perhaps this is because, in an attempt to enable a better visualisation of the point, the author tries to identify existing examples or speaks on an overly technical level, which in some cases serves as a distraction. Also I am not so sure about the timeline stuff.

    I think the post is supposed to facilitate the removing of constraints when it comes to thinking of new ways to store and represent data. It is an early step in a process which may see an improvement or may not but the idea is to get people thinking. Being so dismissive seems to me to be counter productive.

    To combine the concepts of how a computer handles data on an operating system level and how it may be effective to present this data to a human seems to complicate things. It clearly was the intention that both concepts be considered together, presumably one reason being that a mechanism needs to exist whereby, when you add or edit data, the action can be processed (adding metadata/annotations/links) by the operating system. This would be in contrast to having an application like reqall do the processing, which serves only one particular purpose.

  • http://applied-eclectics.org Gregor McNish

    Fun piece, especially the comments.

    As other commenters have suggested, this isn’t a radical new idea: Gelernter’s Lifestreams project, Sugar’s Journal, image-based systems like Smalltalk/Squeak, even Microsoft’s Journal feature in Outlook, limited as it is.

    For the purposes of the article, the underlying representation is irrelevant, it’s the user facing one that matters.

    If we’re getting rid of files, we should be getting rid of applications as well – just let me express myself as I want to, where and when I want to. Why should I have to remember different applications, menus, commands for working with text, images, sound, music, math, programming etc.? This is only a bit facetious – it’s different applications that create all the different files. Smalltalk/Squeak has some interesting things to contribute here.

    I don’t think it’s so much the file as granular chunk that’s the problem, it’s that it’s tricky to manage arbitrary collections of them, and especially tricky to do this in collaboration with other people. You want to be able to summon all the relevant material – references, info created by others, things you’ve done yourself in whatever format, from current and previous work that may have been done in a different context – and you want to be reminded of work you’ve done previously that you may have forgotten about. You want to work fluidly with the material, throw it up on virtual butcher paper by topic, by date, by author, and assemble it into a meaningful, useful selection for the task at hand.

    Being able to work with links and link useful blobs of data is important here– a wiki can work pretty well for this, or a dedicated hypertext application like Eastgate’s Tinderbox. Nelson’s Xanadu was designed for the same purposes.

    I think the problem is it’s hard to create compelling alternatives without a critical mass of users, and without compelling alternatives, people don’t realize how weak their infotentional strategies are.

    Fun article, thanks to you and other commenters.

  • Sebastián Grignoli

    Enough with the wrong concept of qwerty keyboards being designed to slow down typists. The layout was designed to separate the most commonly used keys and thus to speed up typists.

  • http://www.whatsthebeef.org whatsthebeef

    @Gregor McNish

    I’m in complete agreement except where you have stated – “For the purposes of the article, the underlying representation is irrelevant”.

    An original concept described in this post is the underlying representation. Many user side considerations for this subject are described very well here http://www.cs.yale.edu/homes/freeman/dissertation/etf.pdf.

    I think the reason the author tackles the issue of the underlying representation is because it’s fundamental to the concept. I don’t believe we are addressing an application whose only function is to organise human-consumable data into a more efficient format than the standard method of representation now. I think it’s more interesting when such a subject is considered a level below.

    Parts of a practical example I have in my head are as follows: when data is loaded onto an operating system (not a specific application), the system would add metadata, annotate content if possible, and correlate it (storing the result) for consumption by applications which sit on the system, so IO libraries would receive a makeover. Applications could then query the system for an entity, time, context etc. and receive a correlated representation of data related to the query, as well as standard access to the data through some uid. This still suggests a blob-type model where data is also held in its original form.
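
    A very small sketch of that flow might look like the following. The `AnnotatingStore` class, its method names and the choice of fields are hypothetical, invented only to illustrate "annotate on write, query by context or time, keep the original blob reachable by uid"; a real system would sit inside the operating system's IO layer rather than in a Python class.

```python
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class Record:
    uid: str
    blob: bytes              # the data, held unchanged in its original form
    stored_at: float         # seconds since the epoch
    context: str             # e.g. a project, trip, or person
    annotations: dict = field(default_factory=dict)


class AnnotatingStore:
    """Hypothetical system-level store that annotates data as it arrives."""

    def __init__(self):
        self._records = {}

    def put(self, blob, context, stored_at=None, **annotations):
        """Store a blob together with its metadata and return its uid."""
        uid = uuid.uuid4().hex
        when = time.time() if stored_at is None else stored_at
        self._records[uid] = Record(uid, blob, when, context, dict(annotations))
        return uid

    def get(self, uid):
        """Standard access: the original bytes, looked up by uid."""
        return self._records[uid].blob

    def query(self, context=None, since=None, until=None):
        """Return records matching a context and/or time range, oldest first."""
        hits = [
            r for r in self._records.values()
            if (context is None or r.context == context)
            and (since is None or r.stored_at >= since)
            and (until is None or r.stored_at <= until)
        ]
        return sorted(hits, key=lambda r: r.stored_at)
```

    Because the annotation happens in `put`, every application writing through the store benefits, rather than only the one application that knows how to tag its own data.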

    The benefits would be system wide rather than to a single application and as data was added to the system the processing would be performed rather than having to load up through a specific application.

  • http://lifeforcedesigns.110mb.com/ Larry

    I am working on this, too.

    But I don’t think files are the problem.

    I think it’s the software that lies on top of them that’s the problem.

    I take my cues from how the mind works.

    The mind stores memory a little like motion picture film. But the software on top of that structure integrates the data into an entire mental system. Your mental limitations have mostly to do with how your mental software works, not on how it stores data. The challenge is to adapt the way our mental software works to the way our machine software works. With our current tools, it is a daunting task.

  • Sam's Dad

    Great job Alex. This would make for a great TED talk!

  • http://www.wappwolf.com Harald

    great article giving an outlook to the future – however at the moment everybody is struggling with loads of files….hmm just watch the video http://youtu.be/oBc4dOHWUio

  • http://drcoddwasright.blogspot.com Robert Young

    This isn’t a new idea, alas. Although the thrust may be different. Here’s what Linus had to say back in 2007:
    “… but Flash-based storage has such a different performance profile from rotating media, that I suspect that it will end up having a large impact on filesystem design. Right now, most filesystems tend to be designed with the latencies of rotating media in mind.”

    Whether one concludes that files, per se, should or will disappear, is an open question. I take his point to mean that BCNF databases (for transactional data, at least) will replace flat file storage. Depending on the engine, a table in RDBMSs may or may not map directly to a file. It need not, even now.

  • IanL

    I’m a little disappointed by some of the replies to this article that nit-pick on some completely irrelevant points. I’d hoped this would be a more professional forum than to display such petty behaviour. Please, can we stay on topic and look past little mistakes that have no impact on the point of the article?

    I found this to be an interesting article, particularly since I’ve followed this topic for quite some time. Whether we call them files or “discrete units of data”, I don’t think we’ll ever completely get rid of that particular construct. While data does have more meaning when placed in context with other data, there is still a need for physical separation of data (a contact gains semantic meaning when associated with a calendar appointment but the two can, and should, exist as separate and distinct entities).

    I think the problem is not so much with files (or whatever you want to call them) but with folders… it’s a limiting and inefficient approach to storing and organising data. We need to move to semantic networks of objects (discrete units of data, aka files)

    HCI studies have long found that the average computer user finds folders to be confusing. More importantly, even for users who are completely comfortable with folders, it’s easy to lose information in folders.

    GMail executed on this by taking the approach that a single inbox with labels, tags and advanced search was far more efficient than a hierarchy of folders when it came to finding information.

    There’s a really interesting book on this subject called TOTAL RECALL by Gordon Bell; it discusses data, and the meaning that can be extracted when data are combined. As Bell stored more and more of his data digitally, he initially tried to sort it using hierarchies of folders but it became too unwieldy and Jim Gray convinced him to switch to a flat file system backed by a database for organisation.

  • http://alexbowyer.com/ Alex Bowyer

    @whatsthebeef: Yes, looking at it both from UX level and tech level does make it more complicated – but you hit the nail on the head in your second comment – thank you. I don’t think this can be solved at the application level. I think our basis for data storage is a real hindrance to making any kind of progress towards solving data management in a meaningful way.

    @Robert, @Dave Zaffery: As @Gregor McNish points out, it’s not the file as a unit that’s problematic, but rather the filesystem – because it fails to relate and organize data in meaningful ways. A wiki is a good example of how much better data can be linked and crossreferenced than it is on a filesystem.

    @Gregor McNish – I absolutely agree about the need to do away with applications too. Not in the literal sense, but rather that it makes more sense to organize them by capabilities. For example, I have an image, what things do I want to do with that image – oh, Google has a service that lets me do that, or Facebook lets me share it, etc. Rather than, I want to “work on images” so I’ll load Picasa, or I want to “social network” so I’ll load Facebook.

    @Larry – Sounds fascinating – can you tell us more about what you are working on?

    @Harald – Wappwolf looks interesting, thanks

  • http://www.cerny-online.com Robert Cerny

    This software may be of interest to you: Topincs. Think of a wiki that works with forms on a graph database. I recently wrote a paper that summarizes the main ideas: http://t.co/uFngk46 . Maybe you’ll find it useful!

  • http://on-meaning.blogspot.com Yuriy Guskov

    Actually, this is what I have been advocating for the last few years, but in a slightly different form. Files should not die (because they are containers of information, which we need anyway), but we should use them explicitly as little as possible.

    The only thing we need to make this concept work is a semantic wrapper for files. This way, we don’t have to remember the exact file name (which in many cases we just forget). The semantics of a file are expressed not as a set of keywords, but rather as a set of filters which uniquely identify the information. This way, we will be able to get a file by remembering “it’s my photo from my vacation in 2011”, not by the ugly /photos/vac/2011/IMG_2304.jpg.

    If you are interested, you can find more in (look for file wrapping):
    http://on-meaning.blogspot.com/2011/06/great-blunders-of-modern-it-and-their.html
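
One way to picture the filter idea from this comment is the minimal Python sketch below. It is an illustration only, not the commenter's design: the `WrappedFile` class, the `find` helper, the attribute names, and every path except the one quoted in the comment are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class WrappedFile:
    """A file path plus descriptive attributes (hypothetical structure)."""
    path: str
    attributes: dict = field(default_factory=dict)


def find(files, **filters):
    """Return every wrapped file whose attributes match all given filters."""
    return [
        f for f in files
        if all(f.attributes.get(key) == value for key, value in filters.items())
    ]


# Sample library; only the first path comes from the comment, the rest are made up.
library = [
    WrappedFile("/photos/vac/2011/IMG_2304.jpg",
                {"kind": "photo", "owner": "me", "event": "vacation", "year": 2011}),
    WrappedFile("/photos/vac/2010/IMG_1187.jpg",
                {"kind": "photo", "owner": "me", "event": "vacation", "year": 2010}),
    WrappedFile("/docs/tax/2011/return.pdf",
                {"kind": "document", "owner": "me", "year": 2011}),
]

# "It's my photo from my vacation in 2011", expressed as filters instead of a path.
matches = find(library, kind="photo", owner="me", event="vacation", year=2011)
print([f.path for f in matches])
```

In this sketch, adding a filter narrows the result and dropping one widens it, so the user only needs to remember enough about a file to single it out.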

  • http://halfmelt.com Aaron

    Floppy disks for save icons anyone?

  • P3T3R5ON

    I don’t know if this is a viable starting point, but what if data is stored en masse and our interface interprets how all of it is presented to us? For example, I play a game called Guild Wars; the game itself consists of two files, the launch program and the data file. What if you expanded those into the OS and your data? There is no need to have a file labeled mypicture001.jpg; you just interact with a representation of the image in the OS and the computer does all the work within the data file. It could be making copies of it locally, sending that image to a printer or attaching it to an email.

  • liam.collins

    A while ago I looked into how to make an improved version of a ‘sharepoint’. As part of the overall design I came to the conclusion that the arrangement of files could be done using ‘categories’, where an individual file can exist in multiple categories simultaneously; some would be generated automatically while others would be user made.

    Take a photograph of a landscape that you took on holiday with your friends. The photograph would, automatically, be in a default ‘photography’ category and also in one for the date; you could then put the photograph into ‘on holiday in…’, ‘landscapes’ and ‘with my friends’ categories as well. Each would only hold a pointer to the original file so it wouldn’t take up much extra room.

    File attributes: now, how about adding a ‘copy-on-change’ attribute. Edit the file in the ‘friends’ category and you have a new file there (replicating a new entry in the ‘photography’ and ‘date’ ones), but the original would be kept as it lives in the ‘landscapes’ category.

    Searching would be interesting because you could create searches along the lines:

    “Show me the image that I took on holiday, with friends and I copied into the landscape category but changed that one”

    How you create the categories is up to you; this would include having categories within categories allowing the user to catalog their data in the best possible way for them.

    The impact on security of the ‘data store’, in the design, is also interesting, but that would require a more detailed explanation of the end design.
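
The category and ‘copy-on-change’ ideas in this comment can be sketched roughly as follows. This is a toy model under stated assumptions, not the commenter's actual design: the `CategoryStore` class, its method names, and the `date:2011` label are hypothetical, and it assumes an edited copy replaces the original only in the category where the edit was made while also being entered into the automatic categories.

```python
class CategoryStore:
    """Toy store: each file lives once, categories hold only file ids."""

    def __init__(self):
        self.files = {}        # file id -> content
        self.automatic = {}    # file id -> automatically assigned categories
        self.categories = {}   # category name -> set of file ids (pointers)
        self._next_id = 1

    def add(self, content, automatic=(), manual=()):
        """Store the content once and point to it from every category."""
        file_id = self._next_id
        self._next_id += 1
        self.files[file_id] = content
        self.automatic[file_id] = tuple(automatic)
        for name in (*automatic, *manual):
            self.categories.setdefault(name, set()).add(file_id)
        return file_id

    def edit_in(self, category, file_id, new_content):
        """Copy-on-change: editing through one category creates a new file there.

        The new file also gets entries in the automatic categories; the
        original is untouched and stays in its other categories.
        """
        if file_id not in self.categories.get(category, set()):
            raise KeyError(f"file {file_id} is not in category {category!r}")
        new_id = self.add(new_content,
                          automatic=self.automatic[file_id],
                          manual=[category])
        self.categories[category].discard(file_id)
        return new_id

    def search(self, *names):
        """Return ids of the files that are in all of the named categories."""
        sets = [self.categories.get(name, set()) for name in names]
        return set.intersection(*sets) if sets else set()


store = CategoryStore()
original = store.add(
    "original image",
    automatic=["photography", "date:2011"],  # hypothetical date label
    manual=["on holiday", "landscapes", "with my friends"],
)
edited = store.edit_in("with my friends", original, "edited image")

print(store.search("with my friends"))  # only the edited copy
print(store.search("landscapes"))       # only the original
print(store.search("photography"))      # both files
```

Because categories hold only ids, placing a file in another category costs one set entry rather than a second copy of the data, which matches the comment's point about pointers taking up little extra room.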

  • http://keywordsmart.com Jody Apap

    I think a more appropriate title is “why FOLDERS need to die”
    While there is the issue that files change as they are iterated upon, they are still nonetheless files (or sets of data).

    I think what’s really an anachronism (and beautifully illustrated with the secretaries in the smoky office) is the idea that any file, or any set of information, can be usefully stored inside any “folder”. No folder title can adequately describe its contents, so by storing a file in a folder, you are classifying it by only one of its many attributes.

    Maybe it’s just semantics, but FOLDERS seem to really be what we are talking about.

  • Dave Goessling

    Agreed, it should be folders we’re talking about. There will always be discrete files, despite the Web 2.0 evangelism of constant change and contribution, mash-up, social whatever. At some point there will always be entities that are “complete”, finished, a stake in the ground. Otherwise there’s nothing to reflect on, no constants to refer to.
    I agree with Jaron Lanier in “You Are Not A Gadget.” I don’t buy the whole “mash-up is the new art” idea. Sorry, but somebody actually had to play that cool drum pattern you’ve so blithely sampled, and they had to practice and subsume a lot of musical experience to do it. The drummer still had to play it, the producer still had to record it. That’s where that “file” came from. Just because you’ve “tagged” it / identified it as cool (and useful) to you doesn’t mean you had anything to do with its creation. And just because you’ve saved a new “version” doesn’t mean you own it. The same is true across the “content creation” spectrum.

  • David

    I’m not on board with this – yet.

    When I walk into my house, I know where things are only because I’ve filed them away. Not because I have a servant who has organized my belongings according to their whim and provided me with metadata tags so that I can have the servant retrieve my fork from the cupboard when needed.

    You seem to suggest that we are first associative, then organizational. Or possibly not organizational at all. I agree that we use associative behaviors, but we also use organizational behaviors.

    To suggest that files are antiquated simply because they’re more organizational than associative seems naive to me: files are both.

    Do I need better associative mechanisms to search and retrieve them? Yes, I do. I recall that Google Desktop helped me a lot in that arena. But I don’t think we can get rid of our inherent desire to classify and organize – which is simply what you’re suggesting we do, but in a different way.

  • CCF

    I must disagree. The need to organise has always been a human trait, even before the computer .. witness the massive filing and cataloguing systems that existed before the computer came along .. and in fact still exist.

    Like it or not, and call it by any name, FILES will continue to exist, whether you call it DATA, BITSIES, ORANGE or whatever. By REMOVING the filing system totally, you actually decrease the ability of the person to customise the data organisation to their own needs.

    Moreover, tagging actually requires us to organise tags just like a file system, and requires us to remember the convention we used to tag the file. It is not as advantageous compared to a simple filing system as you make it out to be.

    And also, to use your example, just because you send me a file doesn’t mean that I want to keep it the SAME as your version. I might want to edit the document. Or I might want to draw on the picture, or add captions. Whatever it is, if it is automatically synced, then if I edit the picture and FORGET to rename the file, it is going to make you so mad, because your original version is going to be replaced by my personalised version of the file.