Saturday, January 01, 2005

(Part 1 of 2) Shaking the Information Tree
Proposing a Content Force Strategy for the 21st Century

Life grows in branches. Like rivers, human arteries and Web sites, things tend to start somewhere, then branch out and become more diversified. It's natural, then, that our main paradigm for systems of organization – from libraries to Web taxonomies – is the familiar directory “tree” model.

As an information architect, that's what I do: help people organize their information hierarchically by major categories and subcategories. And that's the way many people navigate or browse for their content (search is a different thing) – they find the general heading they're looking for, then drill down through each subheading, and if they're lucky, they find the content they want.

Generally speaking, when people are looking for a content object within an organized database, they will begin with the general and move to the specific in methodical fashion – what we call hierarchical browsing. However, the Internet has shown us that human beings are not only capable of navigating content in less methodical ways, but may even prefer different search strategies depending on the context.

Here's an example of a site with a simple, single hierarchy for users (remember, this doesn't mean that the actual physical organization of the files on the server mirrors this hierarchical scheme):

  • Home
  • About Us
  • Our Products
      • Gadgets
      • Gizmos
          • New Gizmos
          • Our Best Sellers
      • Whatzits
  • Our Clients
  • Downloads
  • Contact Us

In this simple content architecture, a user who is looking for a trial version of a new Gizmo might enter at the homepage and go directly to downloads, or if they wanted more information, they might go to the Products page, then drill down through Gizmos to New Gizmos and hope to find a link to download the trial version of a new Gizmo.
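To make the drill-down concrete, here is a minimal Python sketch of this kind of single hierarchy and of the path a user follows through it. The nesting is inferred from the paths mentioned in the text (for example, Our Products > Gizmos > New Gizmos); the names SITE, find_path and breadcrumb are hypothetical and chosen only for illustration.

```python
# Hypothetical site tree: each key is a page, each value holds its subpages.
# The nesting is an assumption inferred from the paths named in the text.
SITE = {
    "Home": {},
    "About Us": {},
    "Our Products": {
        "Gadgets": {},
        "Gizmos": {"New Gizmos": {}, "Our Best Sellers": {}},
        "Whatzits": {},
    },
    "Our Clients": {},
    "Downloads": {},
    "Contact Us": {},
}


def find_path(tree, target, trail=()):
    """Depth-first search: return the list of pages leading to target, or None."""
    for name, children in tree.items():
        path = trail + (name,)
        if name == target:
            return list(path)
        found = find_path(children, target, path)
        if found:
            return found
    return None


def breadcrumb(target):
    """Format the drill-down path as 'General > More specific > Page'."""
    path = find_path(SITE, target)
    return " > ".join(path) if path else None


print(breadcrumb("New Gizmos"))
print(breadcrumb("Downloads"))
```

In a strict tree like this one, every page has exactly one path. That is what makes it easy to reason about, and also what makes it rigid.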


Any Web designer or information architect worth his or her salt would be able to convince the business owner that this is a logical, usable, intuitive and well-ordered hierarchy for the site.

However, as sites grow and deepen, content naturally cross-references itself across hierarchical lines. For example, users enter at the Downloads page instead of the homepage, or they visit the Clients page and then jump to a product linked on the Our Products > Gizmos > Our Best Sellers page. Or they bookmark the URL for the Gadgets page and never notice the Gizmos page because Gizmos are not in the persistent site menu.


Navigation schemes

Knowing this, most complex sites now incorporate several navigational approaches for users. You might have several persistent menus that allow you to navigate in different ways based on your needs/interests; you might have contextual menus that display different content depending on the page you're visiting; you might have personalized, customized pages that display different content choices and navigational options based on your past browsing habits or stated interests, etc.

Nevertheless, the human mind is such that we still like to have a hierarchical option because we want to know not only WHERE something fits into a site, but HOW it fits into the site in relation to other content. Thus, it's helpful to know that Patrick O'Brian's books might be under Bestsellers > Fiction > Historical Fiction > 19th-century fiction > British authors because you might, for example, also want to know that there are other, similar categories or sub-categories you can explore.


Keyword queries

Even though Web content is increasingly database-driven, especially in eCommerce pages, and effective taxonomies, metadata and search indexing allow users to find what they want simply by querying with one or two key words, we still want to know where something fits in relation to other content on a site. Using keyword searching without any navigation scheme would be like blindly fishing for coins at the bottom of the ocean or, well, like searching the Internet using Google.

In the deep, dark void of the Internet and the invisible or hidden Web, you know there's gold to be mined, and you know you can retrieve 25 million results with a single keyword, but in the end it's the Web sites you visit that give you a better sense of what's on the Internet relative to other sites. Likewise, any site with an effective search engine will help you find what you're looking for, but it can't help you see whatever you're not looking for. As the saying goes, to know the right answers you need to ask the right questions, and you can't ask the right questions until you can see the (virtual) environment in front of you.

Shaking the tree

We can see that people 1) are familiar with a hierarchical structure for information, in which they go from the general to the specific through a series of nested folders or categories, and 2) also like to search using a key word that relates to the information they are seeking.

Now ask any Webmaster what the two most common complaints about Web sites are: 1) I couldn't find what I was looking for because your navigation doesn't make sense to me, and 2) I couldn't find what I was looking for because your search engine didn't return the right results (or returned too many results).

How do we shake out the apples from the hierarchical tree that's not only binding our navigational logic but also making our search queries impossible? One way is to constantly improve the technology whereby data is stored and retrieved, but that only avoids the main issue (which remains an issue for those who are managing the databases). That issue is, how can we cluster information together as tightly as possible but still pull elements out of that cluster as we need them? The key here is conceptual in nature, not technical or mechanical. The key is that once information is amassed it should never have to be extracted, but rather, it should be intelligently interconnected.

Metadata not enough

"Metadata gives your content context. Content that does not have effective metadata is not web content. It is sloppy, next-to-useless print content that has been unprofessionally published on the Web. If you don't have time to publish professional metadata for your content, you shouldn't be allowed to publish anything on a website."

One approach has been to ensure that all information has metadata attached to it based on its facets. An image of the war in Iraq is all of these things to a database: an image, an image of Iraq, an image of war, an image of (other specifics), a file with certain attributes such as date, size, author, etc., a file linked to other specific files, and so on. With the right search engine you could potentially locate this image in a database according to any one of the above facets, or several at once – you'd search for images of Iraq taken by so and so on a certain date, for example.
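As a rough illustration of faceted lookup, the Python sketch below attaches a few facets to each record and filters on any combination of them. Everything in it is hypothetical: the file names, authors, dates, field names and the facet_search helper are invented for the example and do not describe any particular database or search engine.

```python
from datetime import date

# Hypothetical records; every field name and value is illustrative only.
ITEMS = [
    {"file": "img_001.jpg", "type": "image", "place": "Iraq", "subject": "war",
     "author": "A. Photographer", "date": date(2004, 11, 8)},
    {"file": "img_002.jpg", "type": "image", "place": "Iraq", "subject": "market",
     "author": "B. Reporter", "date": date(2004, 11, 9)},
    {"file": "report.txt", "type": "text", "place": "Iraq", "subject": "war",
     "author": "A. Photographer", "date": date(2004, 11, 8)},
]


def facet_search(items, **facets):
    """Return the items whose metadata matches every requested facet exactly."""
    return [
        item for item in items
        if all(item.get(name) == value for name, value in facets.items())
    ]


# Images of Iraq taken by a given author on a given date.
hits = facet_search(
    ITEMS,
    type="image",
    place="Iraq",
    author="A. Photographer",
    date=date(2004, 11, 8),
)
print([hit["file"] for hit in hits])
```

Note that the same record can be reached through many different facet combinations, with no single "correct" path to it, which is the main contrast with the tree model.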

But metadata is created under the assumption that the facets of language, a form of visible code, can be structured so that they can be searched, deconstructed, repurposed and somehow distilled into elements of knowledge that stand apart from the mother document and still retain their original context and substance. For those who use language as a tool, this is a quaint notion, and a problematic one.

...to be continued.
