December 13, 2007

Serious Play: Are We Humans the Biggest Computer of All?

Would you be interested in a future where the pursuit of fun and enjoyment was one of our major roles in life because it leads to solving extraordinarily large and complex world problems? For example, what if all of us helped to digitize all the content contained in all the books in the world? Almost none of this content is currently available on the Web or in any digital form and, as such, it is largely inaccessible to most. And would you be interested in a world where the relationship between computers and humans is a very positive and symbiotic one? I know I would.

I'd like to bring your attention to some of the ways in which this is already happening. Let's see if this excites you so much that you'd like to not only participate in some of these processes, but also start to use some of these models to help solve some of the problems you and your profession or areas of interest face.

In my previous posting "Moving aLOM", I mentioned some of the exciting, yet daunting, challenges of the future of metadata, such as how to create, in staggering volume, some of the more "subjective" metadata—things like the infinite characteristics that describe people, places, and things—where we humans are still the only source. This effort would include things like creating metadata for all the images and videos out there—still a largely unsolved problem—the absence of which not only makes them very hard to find, but also makes the Web and computers very inaccessible to the visually impaired, which, with age, might include a lot of us!

The Exciting Work of Luis von Ahn

Perhaps most notable in this area is a relatively young new professor at Carnegie Mellon University, Luis von Ahn, standing at right in this photo with his PhD advisor Manuel Blum. Luis has already completed some amazing work on what he refers to as "human computation" and how to put "wasted" human cycles to use in solving problems a computer cannot solve at this point in time, but humans can solve easily. Luis also picks up on a theme we have discussed here on Off Course - On Target in many other contexts: the power of the "network effect" achieved by connecting everything and everyone together. Human computation is obviously focused on the latter, and Luis wants us to consider having all of our brains connected together as an extremely advanced, large-scale distributed processing unit. Not to worry, no wires or direct connections to your head are required!

Before I go any further, and especially if you are more of a visual and auditory learner, let me recommend that you immediately watch this talk called "Human Computation" that Luis gave on July 26, 2006, about the power of human cycles. This 51-minute talk is part of the Google Video Tech Talk series (also highly recommended), and while it is long by some current standards, I feel very comfortable recommending it to you, since I'm convinced you'll agree it was a VERY good use of your time (actually Luis' talk only runs 40 minutes, and is followed by about 10 minutes of a good Q&A session).

Another excellent reference for you, which contains more fascinating details and examples of von Ahn's work, can be found in Clive Thompson's article "For Certain Tasks, the Cortex Still Beats the CPU"  in the June 2007 edition of Wired magazine.

Games with a Purpose

But for those who don't have the time right now to look at these things more, here is a quick synthesis of what I find so exciting and interesting about the innovative use of our human "compute cycles", and the use of "fun and games" for very significant and "serious" results: what von Ahn likes to call "games with a purpose".

One of the most common and effective examples of this type of human computation is one of Luis' first applications, which is known as "Captcha". The name may be new to you, but I'm sure you're already a veteran Captcha expert! Captchas are those slightly hard-to-read words that you are asked to identify and type into a box when you are signing up for web sites. Captchas are used for responses online and in other situations where we want to prevent automated "bots" from generating unending amounts of "spam" or otherwise exploiting such online experiences. The problem is how to differentiate between a human response and a computer response, and Captchas are a simple solution to this problem, as well as a simple example of a problem that computers can't solve by themselves.

In itself, this doesn't sound like that interesting a problem, although it is certainly an annoying one! However, part of what I see as Luis' brilliance lies in the more fundamental problems he is solving with this process.

In the case of Captcha, the real problem being solved pertains to my initial reference about the challenge of digitizing all the content of the world's printed matter, such as books. For more background on this digitizing and scanning challenge, you may want to refer to my previous posting from Jan 2007 "Books—the NEW old medium". Specifically, the problem is with all the words found in printed matter that scanning and conversion technology cannot make out, because the medium has a crease running through it, or it is partly missing, or other factors which make it impossible for the technology to recognize the words correctly.  Yet, show these words to almost any one of us and we can easily recognize the word.

So all those "fuzzy" words in Captchas are NOT just some random words that are blurred to fool a computer. Instead, these are the images of words which scanning technology has failed to recognize correctly! Luis refers to this specific application as ReCaptcha and you'll find much more information there, as well as instructions and free plugins for you to embed within your own sites, blogs, etc.

And that's just one side of why Luis von Ahn was awarded one of the MacArthur "Genius" awards and a Microsoft Research grant, for he has also managed to put these types of solutions into a game format that starts to look at solving these kinds of problems at a scale that is truly breathtaking! 

Solving World Problems or Playing Solitaire?

In his talks, Luis likes to use a very compelling metric of human-hours, and he often cites statistics on the number of human-hours that are "wasted", in his opinion, doing something like playing Solitaire on a computer. I too have always been amazed at the number of people I observe when walking down the aisles of an airplane, for example, who are hard at "work" playing Solitaire, but I had no idea just how much time is spent on this. According to the statistics that Luis uses, over 9 billion human-hours were expended playing Solitaire in 2003 alone! Better yet, he puts this into perspective by comparing this activity to such things as:

  • Building the Empire State Building in New York City, which consumed about 7 million human-hours, and thus equates to just 6.8 hours of collective Solitaire playing.
  • Building the entire Panama Canal, which took 20 million human-hours and amounts to less than a day of collective Solitaire playing!
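
The comparison above is simple arithmetic, and a short Python sketch can check it. It assumes the 9 billion human-hours were spread evenly over the 8,760 clock hours of 2003; the function and variable names are illustrative only and do not come from Luis' talk.

```python
HOURS_PER_YEAR = 365 * 24  # 2003 was not a leap year

# Figure quoted above: human-hours of Solitaire played in 2003.
SOLITAIRE_HOURS_2003 = 9_000_000_000


def solitaire_equivalent_hours(project_human_hours, yearly=SOLITAIRE_HOURS_2003):
    """Clock hours of worldwide Solitaire play that add up to a project's human-hours."""
    per_clock_hour = yearly / HOURS_PER_YEAR  # human-hours played per clock hour
    return project_human_hours / per_clock_hour


empire_state = solitaire_equivalent_hours(7_000_000)
panama_canal = solitaire_equivalent_hours(20_000_000)

print(f"Empire State Building: {empire_state:.1f} hours of Solitaire")
print(f"Panama Canal: {panama_canal:.1f} hours of Solitaire")
```

Under that even-spread assumption, the Empire State Building comes out at roughly 6.8 hours and the Panama Canal at well under 24 hours, which matches the figures in the list.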

Metadata for All Images?

Now imagine if we were able to put this kind of "human computation" to more effective use AND still do so within the format of games that people enjoy playing! One example is another of Luis' creations, the ESP Game, which has been running with staggering results for over three years. As we've discussed many times, experiential "learning by doing" is often one of the best ways to learn about something new, so I'd encourage you to not only read about the ESP Game on that site, but to play it for a while. (Caution: it can be very addictive and time consuming!) When you do, you'll see how it puts two or more players (there is also a single-player version) into a friendly competition in which they type in descriptive words for a given photo (that metadata thing again) and get points whenever they both type in the same word.

So what? While progress is being made in image recognition technology, this is still largely a problem that computers cannot solve. And ask yourself, do YOU take the time to "tag" or create all the metadata for the photos and videos that YOU post, such as who and what is in the photo? Didn't think so. Yet by using this type of game format, the ESP Game has been running for over three years with no drop off in popularity and, as of mid-2006, it was very fast, very cheap, and very accurate. If this were done as a popular online game site, it would be possible to label all the images on Google Image Search in just a few weeks! No surprise then that the ESP Game has already been licensed by Google in the form of the Google Image Labeler, and is used to improve the accuracy of Google Image Search. We humans are relatively competitive animals and we like to do what we enjoy, so this approach appears to have a lot of promise.
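
The agreement rule at the heart of the ESP Game can be pictured in a few lines of Python. This is only an illustration of the idea described above, not von Ahn's implementation: the function name, the lowercase normalization, and the sample guesses are all assumptions.

```python
def first_agreement(events):
    """Return the first word that both players have typed, or None.

    `events` is a sequence of (player, word) pairs in the order typed,
    where player is "A" or "B" (hypothetical labels for the two partners).
    """
    seen = {"A": set(), "B": set()}
    for player, word in events:
        word = word.strip().lower()  # treat "Dog" and "dog" as the same guess
        other = "B" if player == "A" else "A"
        if word in seen[other]:
            return word  # both players agree, so this becomes a label
        seen[player].add(word)
    return None


# Made-up guesses for one photo.
events = [("A", "dog"), ("B", "puppy"), ("A", "grass"), ("B", "Dog"), ("A", "puppy")]
label = first_agreement(events)
print(label)
```

Because neither player can see what the other types, a matched word is very likely to describe the image, which is what makes the agreed word usable as metadata.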

Yes, but WHERE is that object in the photo?

Another problem that is even more challenging than identifying WHAT objects are in the image is identifying WHERE they are in the image. To do this, Luis has created another game called "PeekaBoom". The first player sees an image along with a word that describes an object within the image, and then clicks on the image where the named object is located. The second player sees only the part of the image that the first player clicked on and types the word associated with that object. Once the second player guesses the correct word, the two players move on to the next image and switch roles. More details are explained in the video (you really should take the time to watch it!).

Human Computer Relations: Parasitic or Symbiotic?

Luis also notes how this transforms the current relationship between humans and computers from what he calls a parasitic relationship to a symbiotic one where:

"...humans solve some problems, computers solve others, and together we work to create a better world."

Sound far fetched? Well, in the less than two years that his limited experiment of the ESP Game has run, over 75,000 players have come up with over 15 million "agreements" (matched words). This rate would indicate that 5,000 players playing simultaneously could label all images on Google Images in about two months. Think about that...5,000 is NOT a very big number when you consider the numbers on many gaming sites. Therefore, it should be possible to label all the images on the Web in a few months. Again, I strongly recommend that you check out the video to get not only more details, but to see just how accurate, pragmatic, and promising this approach is. 

For example, it turns out that the results of a game such as PeekaBoom can in turn be used to help train computers to recognize objects and their location. One of the reasons that computers are not yet very good at this type of object recognition and automated metadata generation is that there are very few data and examples to use to "train" the computers on how to do it. By capturing the results of all the human play in locating objects within images, this data can then be used to train computers to do the same thing, allowing us to move on to new challenges... and more fun.

Super Side Effects

I think you'll agree that this approach not only shows great promise in terms of solving some very large scale problems, but has some surprising and equally amazing "side effects", like how some people have used this to help them learn a language. This approach has spawned its own game called Babble, where two English-speaking players are shown a sentence in a foreign language that neither of them speaks, and are presented with a list of possible meanings (in English) below each word. Players try to agree upon a set of English words that forms the most coherent sentence. The result is that this activity is surprisingly effective in translating foreign text into English without requiring anyone fluent in both languages. Think of the possibilities of this running at a larger scale!

Another "side effect" of this approach is that many players have noted that they end up finding other people who think very much like them, and thus they feel a great sense of "intimacy" and closeness with their counterparts who play these games. Therefore, many ask if they can find out who their anonymous competitors are to continue the conversation. At this point in time, all the game players are anonymous and no identities are revealed, but one could imagine this being used as a way to help discover other people "like you", ones you'd want to meet and get to know better.

Common Sense Isn't that Common; yet!

And lastly from von Ahn's work, check out his new game Verbosity, which helps to generate what he calls "common sense facts" (again just more metadata really). One player is given a word and the other tries to guess what it is by completing fill-in-the-blank-type templates, such as "It is a type of ____" or "It contains ___". The player who entered the original word can answer "true" or "false", but can't use the word itself. All this is very much like some party games that many of you have probably played, but the important difference here goes back to the original point of the summative network effect and how this can all be put to greater use. In the example that Luis shows in his presentation, the word "milk" would have some common sense facts such as:

  • It is white
  • It is a liquid
  • It is often used to eat cereal
  • It has lactose

Again, computers cannot currently solve this kind of problem, and it is another example of the need for massive amounts of metadata. Imagine if we started generating massive volumes of these "common sense facts" and they were readily available to all.

More Competition = Less Carbon?

Lest you should think this is just a "one man show" from Luis von Ahn, I want to point out that there are many others who have been developing, adopting, and adapting similar models. For example, "Carbonrally: Carbon Challenge", which you can learn more about from the Nov. 20th, 2007 Webware post "Carbonrally: My carbon footprint's smaller than yours" by Martin LaMonica, is an application that is showing some great promise for reducing the greenhouse gas emissions, or "carbon footprint", of individuals and organizations alike. Carbonrally adds the dimension of some fun and healthy competition to do better than others. As Martin describes it, Carbonrally is "tapping into people's tribal competitive spirit".

Whew! That's quite an introduction to what I believe is both a powerful and profound pattern emerging, one where the natural pursuit of fun, healthy competition, and challenges is combined into a game-based model that has already shown some of the ways we can solve large-scale present and future problems. It also creates a whole new relationship between us and technology. This model is not only interesting and fun, but it is a fascinating example of "user generated metadata", which I mentioned in my previous "Moving aLOM" posting.

Your Turn to Play!

Besides raising your awareness about "human computation" and the power of this approach, I also want to encourage all of us to put more time and energy into figuring out how we can inject more fun into work and other problem-solving situations. As you do so, I think you'll see that an important job, task, or problem can be more fun if it's solved with some kind of game play, and where the solution remains very much a human one.

We cannot, for now at least, expect computers to come up with such fun and game-based solutions by themselves!

For starters, if you have websites or other applications where you have problems preventing spam or other misuses, consider taking advantage of some of the freely available plugins and utilities, such as those from the ReCaptcha site. Longer term though, please put some thought into which problems you could address with this model, and the ways you could do so by injecting the fun and challenge of a game-based approach into the more serious problems you need to solve...then share them with us here at Off Course - On Target.

I'm reminded of the great quote from Brian Sutton-Smith who said:

"The opposite of play is not work; it's depression!"

And I look forward to hearing all the innovative and creative ways you will come up with to solve problems—large and small—and replace depression with play. Have fun!

December 07, 2007

Moving aLOM

If you are a regular visitor here at Off Course - On Target (OCOT), you know that metadata (the characteristics that describe anything and everything) has been a major part of my life and a major focus for many years. If you'd like the full story of my initial recognition of metadata and its value, you can listen to or read my previous posting "Wayne's Wine Epiphany".

What is metadata?

Sometimes metadata is more commonly called "tags", such as the information you provide for things like photos that you upload or blog entries you create and search for. At a simple and personal level, metadata would include your name, phone number, address, family members, your likes and dislikes, skills, knowledge, etc. These are all of the literally millions of characteristics that describe, and to some extent, define you and the world around you.

Among many other benefits and uses, metadata is critical for improved "findability" and discovery, as opposed to searching. It is largely via metadata that we are able to find the "right" people, places, and things (with "right" referring to our individual situations, context, and needs). This also works in reverse by enabling other people, places and things to find us, where appropriate and wanted.

What's been my involvement?

One of my more significant commitments to metadata started back in 1997 with the creation of the IEEE Learning Technology Standards Committee or LTSC, and within this committee, the formation of the Learning Object Metadata Working Group or LOM. LTSC is a group of volunteers who are devoted to development and implementation of standards for interoperability for use within the worlds of Learning, Education and Training (LET). LOM is a set of standards focused on the metadata required for more effective learning and performance.

I've had the honor of being the Chair of the LOM Working Group for over ten years, and this has afforded me the privilege of working with some of the most dedicated people I know. They have worked tirelessly, and often thanklessly, to produce several fully completed standards for metadata such as the IEEE 1484.12.1 standard for the LOM data model and the IEEE 1484.12.3 standard for the XML binding of LOM to enable the exchange of LOM instances (metadata records).

You may not understand or even be interested in these specifics, which is as it should be for most standards. How much do any of us care or know about such things as TCP/IP, HTTP, or the other standards which make the Internet possible? In a similar way, standards for metadata, of which LOM is but one, are part of what has enabled improvements in the creation and interoperability of metadata (though much is still needed).

To our surprise, LOM standards have been implemented broadly, both within the context of learning, education, and training, as well as within an eclectic and extensive list of other domains, including art, history, archives, and human relations. I know of no way to count the amount of such LOM-based metadata nor the number of implementations of LOM, but the numbers are globally dispersed and easily numbered in the millions and beyond.

What's Next?

Now it's time for both LOM and me to move on into our respective next stages, hence the title of this posting. As of January 1, 2008, I will be stepping down as Chair of the IEEE LOM Working Group, and I'm delighted to publicly congratulate Erik Duval on being appointed as the new Chair of LOM. I am about to make some significant changes in my roles and responsibilities, both personally and professionally (more on this in a future posting), and it is time for LOM and metadata overall to evolve to best fit the "Brave New World" we now live in. In spite of his relatively young age, Erik Duval has been one of the longest serving individual experts focused on metadata for learning, education, and training. Based on his work in metadata since the early 1990s, such as the creation of the ARIADNE project, which is a large European-based consortium focused on knowledge sharing and reuse, Erik was instrumental in the creation of the IEEE LOM WG from its very beginning. Officially, Erik has served all this time as the Technical Editor of LOM, and he and Tom Wason created the initial kernel that grew into the full LOM standard. I could not be happier or more optimistic about the future of LOM and of the advancement of metadata than I am with turning over the leadership to such a capable individual and someone who has become one of my closest professional colleagues.

While those of us who first began to put this focus on metadata knew it was important for the future, I'm not sure that any of us could have imagined the degree to which this would be true or the scale of use and generation of metadata. Meeting these new needs and this scale will require both the evolution of metadata as we know it and a complete rethinking. Some new leadership and energy will be of great assistance in making this happen. As such, the other main purpose for this posting is to bring your attention to some important and recent developments in the area of metadata: the first is a series of new activities within and related to the current LOM standards, and the second addresses the longer term future of metadata developments. Both are worth keeping your eyes on.

Where is LOM heading?

Here's a short overview of the new activities related to LOM:

  • Reaffirmation of the 1484.12.1 LOM standard, which is largely an administrative action required by IEEE for all active standards every five years. As the name implies, this is merely a check that an existing standard is still in active use and will continue to be so. As the millions implementing LOM can attest, this is very much the case.
  • Corrigenda for the 1484.12.1 LOM standard, which will provide a list of all the minor (but important) technical corrections and edits to the original LOM standard, which have been discovered by those previously implementing LOM.
  • Two New Parts for LOM: After several years of work led by Mikael Nilsson, the Joint DCMI (Dublin Core Metadata Initiative) / IEEE LTSC Taskforce has just initiated work on two new IEEE standards. The previous link will provide you with access to all details of the work to date, previous meeting notes, and ways to contribute to these efforts. As briefly and coherently as I can put it, these two standards are for:
    • Developing a Recommended Practice for Expressing IEEE Learning Object Metadata Instances Using the Dublin Core Abstract Model to meet the growing demand for interoperable definitions of Dublin Core Metadata Initiative (DCMI) metadata terms and IEEE Learning Object Metadata (LOM) data elements, which allow these to be used together in metadata instances.
    • Developing a Standard for Resource Description Framework (RDF) Vocabulary for IEEE Learning Object Metadata (LOM) Data Elements. In simpler terms, this standard will  address the increasing demand for definitions of IEEE Learning Object Metadata (LOM) data element semantics, which allow the expression of IEEE LOM instances in applications using Semantic Web technologies such as the Resource Description Framework (RDF). For some data elements, this expression can be achieved using existing, stable RDF vocabularies. The purpose of this standard is to define the semantics of data elements not covered by such vocabularies. This standard forms an important basis for making IEEE LOM useful in this larger metadata context.
  • LOM next:  Over the last year or so, we've discussed how we want to make LOM evolve over the longer term. The time has come to consolidate that discussion, gather requirements, and start thinking about how to meet those. Erik and the LOM Working Group have begun a series of open, regular, synchronous discussions in order to first bring everybody up-to-date on these activities, develop a plan of action, and then to begin the necessary new work.
    • These meetings are open to ALL and will be virtual meetings accessible both online and via phone.
    • If you are interested in participating, please either contact Erik Duval directly via e-mail (Erik.Duval@cs.kuleuven.ac.be) or subscribe to the LOM mail list on the LOM web site.
    • While those with metadata expertise would be especially welcome, it is equally valuable to get input from a diverse range of others who want to use and benefit from significant improvements in metadata for LET in the future. Please consider adding your input to this important effort.

Trends in Metadata

Metadata is often unnecessarily limited by the popular "data about data" description, but it is so much more than this.  Metadata is perhaps most often applied to "nouns", and my simple minded recollection of the definition of a noun is a person, place, or thing. To date, most of the focus has been on metadata for content (which has been very beneficial and for which much more work is still needed), but the future will include much more attention on the other "nouns"—people, places and things. This post would go on for much too long were I to do justice to any one of these or countless other areas that would benefit enormously from improvements in their related metadata aspects, so I will only list a few areas and provide you with a glimpse of the future potential within. Watch for future developments in metadata for some of the following:

Metadata about PEOPLE

    This kind of metadata especially pertains to our skills, knowledge, abilities, experience, attitudes and competencies.

    In one small example, the IEEE LTSC Working Group 20 recently completed a standard for "Reusable Competency Definitions" or RCD, and this Working Group is now looking at other aspects of competencies that would benefit from standards. 

    Metadata about PLACES

      For example, we are seeing a recent surge of metadata in the use of maps, and GPS metadata is being added to things like Google Earth, which will enable us to answer questions such as:

      • "Where are you now?"
      • "Where was this photo taken?" 
      • "What does this location look like?" 
      • "What happened here in 1782?"

      Imagine the possibilities as more locations become "smart" with metadata about them and related to them. Photos and video might show what they look like now and in the past. Metadata will be increasingly available for every building, its contents, furniture, features, hazardous materials, fire extinguisher and escape information to name but a very few metadata elements.

      Metadata about THINGS

      Metadata about things provides the characteristics of all the physical objects in the world, such as machines, parts, equipment, food, furniture, music...well you get the idea.

      Add to this all the non-physical things, such as objects created in virtual worlds. Now imagine if all these "things" were connected and could start to share this information and "talk" to each other.

      You are already familiar with bar codes, which contain the metadata for everyday things, as well as the more recent use of RFID tags to electronically capture and broadcast all of this metadata. This is sometimes referred to as "the Internet of things". See the 2005 executive summary of the Internet of Things for one perspective and more detail on this concept.

      For example, imagine if all the ingredients in your kitchen made all their metadata available, such as how full or empty they are, when they are about to expire, which combinations might let you make a dinner along the lines of what you desire, and without a trip to the store.  It's all just metadata!

      AUTOMATED metadata generation (AMG)

      Once you start to consider the massive amount of metadata that is required and possible for each and every person, place, and thing, you quickly "do the math" and realize the overwhelming problem of "How will all this metadata ever be created?" Our initial tendency has been to assume that metadata is all human generated—literally "typed in" to forms. If this were true, there would not be much of a future for metadata, since there is most likely more metadata than data and certainly more metadata than there are people, places, or things! 

      While human generated metadata, especially the more "subjective" metadata elements, will always play an ever more critical role in the future, it will become the minority of the overall volume of metadata. Increasingly, metadata will be generated automatically.

      To learn more:

      • See this article on AMG which comes from one of the many groups that Professor Erik Duval leads at KU Leuven, a prestigious Belgian university.

      USER GENERATED metadata

      Did you know that literally all the metadata for all the CDs and music you see displayed on your MP3 players, iPods and computers (artist name, title, album name, etc.) is generated by other listeners, such as yourself, and NOT by the record companies or publishers? What if we could tap into the metadata that each one of us (eventually all 6.6 billion of us) is probably generating every day, such as the tags and captions we add to photos, the PowerPoint slides we create, and the search terms we use, to name but a few? Such is the power of user generated metadata, and there will be much work in the future to increase its generation, to capture it, and to put to effective use the flood of metadata that will result.

      ATTENTION metadata

        Attention metadata is a common term for all the metadata that captures your likes and dislikes, and which can help you find everything from great music to listen to, people to get together with, TV shows and video to watch, etc. We can think of it as the things we "pay attention to"...hence the name.

        Attention metadata is what recommender systems are based on. One such effort to address some of the needs for better capturing and interoperability of this type of metadata is that of the attention.xml group. You can listen to this 2004 podcast with some of the originators of attention.xml and this podcast and blog from Alex Barnett discussing attention related topics.

        Why would you need this? Consider shopping sites that track your buying patterns, and your opinions and preferences after such purchases, and use these to help you find additional items that you may want (if you let them). How does the system know if you are buying the item for yourself or as a gift for someone special? Currently they do not, and therefore the recommendations become less relevant and you likely stop using them. However as these issues begin to be addressed, there will be more and more "decision support" to help us deal with the growing problem of an economy of abundance and too much choice for those of us privileged enough to live in such situations.

        Metadata UNIQUE and SPECIFIC to LET

          While some of the metadata standards, such as LOM, are intended to cover the application to LET, most of the initial work to date has been much more general and largely applied to content. There is an enormous need for much greater focus on metadata that is unique and specific to learning, education, and training. This would include metadata to assist with evaluation and assessment—matching learning styles with teaching styles, and helping each of us as unique individuals to have LET options that are just right for us at just the right time and in just the right way.

          And trust me, this is but a minor scratch on the vast surface of but one slice of metadata and its very exciting future! 

          So LOM, for now....

          I certainly have mixed emotions about reducing my direct involvement in LOM and the development of some of these future metadata-related topics. However, I can't imagine leaving LOM in better hands than those of Erik Duval and the many, many other individuals, old and new, who have such dedication and passion for improving learning, education, training, and performance and indeed the world in general, through better use and generation of metadata.

          Whether or not you consider taking an active role in this future development of LOM and metadata standards and specifications, I certainly encourage you to pay more attention to the role of metadata and how it serves as a fundamental principle in the future of your life, both personal and professional, and the future of the world around us.

          Wayne

          August 17, 2007

          Whither goes Web 2.0? The value of hype cycles


          I’ve been concerned for some time that hype often interferes with the adoption of powerful ideas, especially when the hype prevents us from seeing how these ideas change the way we think or view the world, or otherwise provide valuable nuggets that we can use later on. This problem is no less true for what has been happening with Web 2.0.

          Jared Spool at User Interface Engineering apparently shares this same concern. I highly recommend you take some time to read his recent paper called “Web 2.0: The Power Behind the Hype” where he says:
           

          “Problems not withstanding, we still feel that this emerging standard, combined with other new tools, such as AJAX and open source infrastructures, makes for a new and exciting environment. There's been a tremendous amount of hype surrounding all these new developments, but, for once, we are thinking that there really is some power that is beneath the hype that is worth paying attention to.”

          Not only does he talk about the shortcomings of so much hype, but he also discusses a number of things that parallel my own perspectives:

          "The speed and ease at which these new applications were built is what is getting us very excited about the potential of the Web 2.0 world."

          And speaking to the power of mashups, which I’ve addressed here at Off Course – On Target, he goes on to say:

          "Evocative of Dr. Frankenstein building a monster in his attic laboratory using body pieces he found lying around his neighborhood, people with a little skill can create new applications using common elements found lying around the Web in almost no time at all. As the skill requirements for building these applications are decreasing, we think this opens a whole new world of possibilities."

          Jared goes on to offer more examples of the emerging and lasting power of Web 2.0 characteristics such as APIs, RSS as an interface, folksonomies, and connections via social networks, then finishes with an emphasis on the faster/cheaper nature of application development as well as some of the work remaining to be done.

          Fortunately, Web 2.0 seems to be progressing through the hype cycle, a concept developed at Gartner, an information and technology research and advisory firm headquartered in Stamford, Connecticut. The Gartner hype cycle model is too restricted to technology for my liking, but I do find it useful as one form of "value filtering".   

          The first three stages of the Gartner Hype Cycle are a great test or filter for new ideas and technologies. If a technology survives the initial hype, then there is really something there of lasting value. I like to focus most on what comes after these initial stages, after the bubble bursts, and look for the "residue" that remains: typically very valuable little nuggets that we can gather and put to use, and that will hopefully lead to mass adoption. Hence my bringing this hype cycle model to your attention as a useful tool for your arsenal to help sort through all the choices and "next big things".

          In my opinion, some parts of Web 2.0 are now moving into these phases of the Gartner Hype Cycle:

          • Slope of Enlightenment, where the press has lost interest, but some businesses continue to experiment with the technology to determine its benefits and practical application, if any.
          • Plateau of Productivity, where the technology becomes more stable, and the benefits become widely demonstrated and accepted.

          If you find this model valuable, you can check out this list of other industries and topics that Gartner has applied it to. You may also want to check their use of the hype cycle in their "Emerging Technologies" report. The August 2006 report has three sections: Web 2.0, Real World Web, and Applications Architecture. I think many of the major themes mentioned, such as Collective Intelligence, Mashups, and Location Awareness, have lasting value.

          Philipp Keller recently used this model to plot out the evolution of tagging (creating metadata) since its inception about 2003.

          Do you feel that you're caught up in a hype cycle? With all the new tools, technologies and trends coming your way, are you finding it hard to sort the wheat from the chaff? You might want to try using Gartner's five-phase model to plot these technologies out for yourself to help you decide what's worth keeping.

          Hope this helps and as always please send me your comments.

          Wayne
          =====

          July 02, 2007

          Context, attention, vanity and other powerful drivers of the future

          On June 23, 2007 in Vancouver, British Columbia, I was honored to give the keynote presentation at the second annual Contextualized Attention Metadata Workshop (CAMA 2007). This event, part of the Joint Conference on Digital Libraries (JCDL 2007), was very well organized by Erik Duval, Jehad Najjar, and Martin Wolpers, all from KU Leuven University in Belgium, and was additionally sponsored by the ARIADNE Foundation, ProLearn and MACE, each of which is a worthwhile project in the European Union. I recommend you check them out.

          I suspect that Contextualized Attention Metadata may be a bit foreign to many of you, so here, taken straight from the workshop description, is what it’s all about:

          Contextualized attention metadata (CAM) captures the data on attention that a user spends on resources in a specific context. CAM enables us to better support the user in dealing with the information flood. Using CAM, filters can be devised that present new information only in the relevant context, for example by prioritizing incoming email based on the attention previously given to the topics of the email. Furthermore, CAM data can extend and amend user profiles thus enhances personalization in existing systems. CAM streams are collected from all applications that a user may interact with, including digital libraries, office suites, web browsers, multimedia players, computer-mediated communication and authoring tools, etc.
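The email example in that description can be sketched in a few lines. This is only a toy illustration under stated assumptions: the `(topic, seconds)` event format and the `topics` field on each email are invented here as a stand-in for a real CAM stream, which the workshop description does not specify.

```python
from collections import defaultdict


def attention_by_topic(events):
    """Sum the seconds of attention recorded for each topic.

    `events` is a list of (topic, seconds) pairs, a made-up stand-in
    for a real stream of attention metadata.
    """
    totals = defaultdict(float)
    for topic, seconds in events:
        totals[topic] += seconds
    return totals


def prioritize(emails, events):
    """Order emails so those on the most-attended topics come first."""
    totals = attention_by_topic(events)

    def score(email):
        # Topics never attended to contribute nothing to the score.
        return sum(totals.get(topic, 0.0) for topic in email["topics"])

    return sorted(emails, key=score, reverse=True)


events = [("metadata", 120), ("travel", 30), ("metadata", 60)]
inbox = [
    {"subject": "Flight deals", "topics": ["travel"]},
    {"subject": "LOM update", "topics": ["metadata"]},
    {"subject": "Lunch?", "topics": []},
]
ordered_subjects = [email["subject"] for email in prioritize(inbox, events)]
```

The interesting design question is where the topic labels come from. In this sketch they are supplied by hand; capturing them automatically is exactly the kind of implicit and inferred metadata that CAM research is after.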

          However you describe it, CAM is relevant for most of us in everyday situations, because CAM is one of the fundamental enablers for the Snowflake Effect of mass personalization that I’ve been championing for many years. CAM is at the heart of what will make it possible for...

          just the right stuff (content, code, etc.)...

          to reach just the right people...

          at just the right time...

          on just the right device/medium...

          in just the right context...

          in just the right way.

          I’m sure you can add a few other words after “just the right” to improve this even more, but you get the idea.

          And this is NOT just a vision. Examples are already appearing, such as:

          • Finding just the right music to listen to (Pandora, Last.FM, Musicovery, ZuKool, etc.)
          • The latest dating technology, which is very good at helping you find just the right person and which, with the context changed from romance, works equally well for finding just the right person for any other purpose.

          If you consider this capability from a broader perspective, you start to see how powerful “just the right” can be as we get better at having just the right:

          • Things to read at just the right time
          • People to call when you have a question
          • Individuals for your project team

          And I’m sure you can come up with many more examples.

          This concept is easy to grasp, but turning it into reality is a healthy challenge. Figuring out what is “just right” for each of us at any given time and in any situation is no small task, and yet, progress is being made. Focusing on CAM will make it happen that much faster.
          Below, you can view the slides I used to support some of my comments at the workshop and download them directly from my Slideshare site.

          As you can see on slide 19, I emphasized some of the most predominant R&D efforts in this area, and noted my “wish list” of items that need more research, tools, utilities, and services for CAM:

          • Pattern recognition capabilities
          • Implicit and Inferred metadata capture
          • Visualization to process CAM to expose patterns (to both humans and machines)
          • Equivalent of the music genome project for content and context
          • Context REMOVAL (from content)
          • Synthesis and automation of “objectives”
          • Metadata automation
          • Online/offline solution for CAM (e.g. ability to track my actions, behaviors, and activities, whether offline or online, as much on the desktop as in the browser)
          • Standards for interoperability and mashups of CAM
          • Optimizing discovery

          Fortunately, I was able to stay for the rest of the workshop as well and thereby benefit from the other speakers and papers that were presented. You’ll find a full list of all presenters and their papers as well as all the slides and mp3 files of the presentations on the CAMA 2007 site.  But let me highlight just two that I think you’ll find particularly interesting:

          What I took away from Joe Pagano’s presentation, "Measuring audience attention across multiple channels for a new Web site" was their finding that every site is unique (the Snowflake Effect) in terms of how best to attract the most attention. In the example cited in the paper, they measured audience attention across multiple channels for a new web site Chronicling America, introduced in March 2007. Interestingly, for this site and audience, “online word of mouth” (OWOM) referrers were the most significant sources initially driving discovery of this site (see the following chart).

          In particular, what they called “genealogy sites” (e.g. obituaries) scored the highest, followed by blogs, referrals from the Library of Congress site, e-mail, and lastly, search. It is likely that over time, search will become more effective as more links to the Chronicling America site help to increase the site’s ranking, and this pattern is already suggested in the chart. However, as Joe concluded, it also shows how OWOM plays a critical role, especially in the initial phases of the introduction of a new site or new content.

          Seth Goldstein, co-founder and chairman of Attentiontrust.org and one of the original investors of and advisor to del.icio.us, started the event with an interesting review of his observations of the CAM landscape from a more commercial perspective.  As Seth and the attentiontrust.org site put it so succinctly:

          [Image: excerpt from the attentiontrust.org site]

          Seth stressed the importance of adopting and respecting the fundamental principles of attention: property, mobility, economy and transparency. He also made the interesting remark that “attention is now media”.  By this he means that streams of attention, where people choose to stream/broadcast/share their attention to things like music through Last.FM, to web sites through del.icio.us, and to photos through Flickr, are now growing exponentially.

          You can see a tangible form of this “attention funnel” in Reblog, which is an “RSS aggregator for reading and republishing”. Reblog makes the process of filtering and republishing content from many RSS feeds easy and fast. Rebloggers subscribe to their favorite feeds, preview the content, and select their favorite posts. These posts are automatically published through their favorite blogging software, creating an attention funnel. Seth posted an intriguing blog entry last year about how “APIs are the printing presses of social media”.

          However, one of the more provocative observations that Seth made was his assertion that what drives online behavior is “vanity and popularity [which are] more powerful than things that help me” and that “publicity is trumping privacy.”  Attention is one of the scarcest of all resources and we all want more of it!

          You can think of this as “attention in reverse.”  Most of the work on attention is based on YOUR attention, what are YOU interested in, paying attention to, etc. Seth was noting the inverse; in his opinion, an even more powerful force is our interest in “Who’s paying attention to ME?”  We see this with such things as the great importance given to knowing how many people are reading my blog, visiting my web site, watching my YouTube videos, who has the most online “friends”, etc. 

          One recent example you might like to look at is atten.tv, which lets you either broadcast your clickstream to the world or watch what others are clicking on, all in real time. Seth sometimes refers to all this as the “Attentron”, which he describes as “watching people’s browsing patterns as entertainment.” Seth has created his own version of this with Trakzor, which is a community-driven MySpace tracker that lets you see who’s checking you out. This capability is also available on Facebook. And while it is all rather wild at this early stage of development, it is worth noting that Yahoo! purchased mybloglog.com, which lets you see who else has been looking at your blog.

          While I agree that this “attention in reverse” is a version of the very real human traits of ego and vanity, I’m not yet convinced that these are more powerful forces than the value we place on people and other sources of assistance—things that help us. But I do believe that “enlightened self interest” is both a powerful and very positive driver. The capture and management of context and attention metadata is key to harnessing this power and getting us ever closer to the vision of “just the right” and the Snowflake Effect.

          My recommendation is to keep your eye on developments in these areas of context, attention and automated metadata and to do as much “learning by doing” as you can so that you have experiences of your own to reflect upon as you try out whatever versions and applications of attention and context tracking you prefer.

          And in the spirit of all of us liking more attention, send along your experiences and observations, as well as links to your blogs, articles, podcasts and videos. To paraphrase Andy Warhol, your 15 seconds/minutes of fame (attention) await you! <g>

          Wayne
          =====

          Andy Warhol, photographed by Helmut Newton

          June 01, 2007

          The Future is Surfacing

          For the past few years, I’ve been talking about digital surfaces dominating our future, and my postings, “The Old Medium becomes the New Content” and “P-Learning: Fill up your Tank and your Head?” pointed out some good examples of this trend. So you can imagine how excited I was to see today’s announcement of Microsoft’s ventures in this direction, and I’m particularly excited about the name they chose. It may not be a cool product name but it's hot stuff for me!

          At the "D: All Things Digital" conference on May 29th, Microsoft finally revealed a well-kept 5-year secret, code named “Milan”, and now unveiled as Microsoft Surface. This technology is part hardware and part software and some are referring to it as a “Table PC”. Read that term carefully: that’s TABLE as in “desk” and is not to be confused with the one with the extra "t", the Tablet PC, though it too may benefit from this surface technology down the road.

          The hardware behind it is fairly straightforward, though a feat in itself, because it is a large multi-touch screen or display, typically about 30 inches in size, which is mounted horizontally, facing up like the surface of a small table. To have the imagery display directly on the touch screen required a combination of angled side projection and touch screen technology.

          Of course, hardware without software doesn’t do much. Add Microsoft Surface software into the equation, and things get really interesting. The software makes it into a full multi-touch display with lots of built-in features for gestures, recognition, etc. “Multi-touch” means that the screen responds to almost any number of simultaneous touches. You can do simple finger painting using up to all ten fingers at once, or perform more complex manipulations using multiple fingers on both hands, such as stretching out an image or window, or rearranging and moving windows around with your hands.

          You can also “pass” things that are on the screen over to someone else. Many individual users can physically fit around the “table”, interacting with the surface simultaneously. One additionally interesting fact is that Microsoft will produce both the hardware and the software, something they are already doing, for example, with the Xbox 360 and the Zune music player.

          It’s been quite a news item, so there are lots of links about it. Here are the ones that I thought give the best understanding and demonstrations:

          Of course, such multi-user multi-touch screens are not new, nor are they something Microsoft invented, and Jeff Han is often cited for his excellent research and development of multi-person, multi-touch technology and interfaces. It is well worth spending as much time as you can looking over his site. Jeff has a popular demo, which you can see below:

          And if you like that one, have a look at a similarly fascinating demo from Jeff that Fast Company posted called “Remapping the Universe”. Jeff has also recently started a company with the great name of “Perceptive Pixel” to help commercialize and develop his work, so let’s hope we’ll be seeing much more of this technology move from the lab to our tables. Fast Company also has a very good in-depth interview with Jeff called "Can't Touch This" that is recommended reading.

          When you check out some of the demos of Microsoft Surface, you’ll also notice some fascinating additional capabilities such as the ability to:

          • Use some physical objects to perform tasks, such as painting on the surface using a real paintbrush.
          • Use gestures that are reasonably intuitive, such as the way you would normally work to rearrange things on a table (move them around, stack them), but with the added ability to shrink or expand them, or have them include movement, such as animations and video. If you’ve seen the movie Minority Report, you’ll have the basic idea, and the demos will show you more in a few minutes, so I’d recommend you watch them.
          • Have physical objects “tagged” in several ways. For example, a “domino tag” can be attached or embedded into the physical objects, which are then recognized by the surface when you set them on it, and they can also communicate with the surface and computer to trigger further actions. This really starts to mix and mingle the physical “real world” with the digital virtual one. For example, you can read or recognize credit cards, loyalty cards, drink glasses, and paint brushes.

          Initially, Microsoft Surface technology is planned for use in businesses and high traffic areas, such as airports, cafes, restaurants, and casinos, and comes with initial pricing to match, estimated at about US$10,000. However, we can expect that this technology will follow the same inevitable and rapid reduction in cost as other technologies, and will see equally rapid increases in performance.

          The upcoming Apple iPhone and some of the new Tablet PC screens, such as the new IBM/Lenovo ThinkPad X60, also have multi-touch displays, and are designed for the mass market, so they will introduce more of us to this new type of interaction and interface.

          Therefore, NOW is the time for us to prepare and think about how we could utilize this type of technology in the workplace, the home, and the classroom.

          • Think about using it to interact with maps.
          • Imagine every desktop surface in your meeting rooms and classrooms having this ability.
          • Imagine walls that are huge multi-touch surfaces!
          • Rather than a computer on every desk, what if every desk were a computer?

          Add to this vision some other recent announcements, such as the new addition to Google Maps called “Street View”, which allows you to pick any spot on a map and get a fully immersive set of 3D images that you can control. Microsoft’s “Photosynth” technology takes a large collection of photos of a place or an object, analyzes them for similarities, and displays them in a reconstructed three-dimensional space. Imagine working with these on a tabletop surface that you, your friends, family, students and co-workers are sitting around and can now interact with, rearrange, zoom in on, and explore together.

          I’m particularly interested in the area of human-computer interaction, and specifically, bringing our human “action” (such as that of our hands) much closer to the action on the display. When you think about it, the gap is currently very wide, given our reliance on keyboards, mice, trackballs, and game pads. For this reason, I’m still a big fan of tablet PCs and believe they are still on track to hit a tipping point of popularity and ubiquity in the next few years. As mentioned above, the new ThinkPad X60 tablet PCs have an optional multi-touch screen, so the trajectory continues to be clear to me.

          With Microsoft Surface, our hands and our gestures are now right up against the display images, about as close as we’ve come so far to being one and the same. Next, we will break through the limitations of two-dimensional devices and begin to have three-dimensional representations and haptic (force) feedback so that we can feel the objects and models, and begin to sculpt “virtual clay” with our hands.

          As additional dimensions and senses (such as smell, texture, time, locations, and sound) are added to this equation, the whole computer interface issue will increasingly fade and eventually become transparent.
          Then we can focus on what we are doing and the results we are trying to achieve. We can use our abilities to visualize and express our ideas for others, to do “digital prototyping”, and experience things before they are “real”. Of course, ultimately we will continue to blur the distinction between what’s real and what’s virtual, and literally redefine what “real” even means.

          So here is another set of examples where powerful new things are equal parts exciting and frightening. However, as I like to point out, WE are the decision makers in all this, and it is up to us to make sure that the future that “surfaces” is one we really want and like!

          Wayne
          =====

          March 29, 2007

          The Search for Better Finding!

          As I often say to my audiences, “There is a big difference between searching and finding.” It’s along the same lines as the difference between shopping and buying, or fishing and catching. I’m not suggesting that one is “better” than the other, just that they are VERY different and often confused with each other. Apply this distinction to the Web, and I suspect that if we were to evaluate our current practices and time use, we’d see that we spend a LOT of time searching and not very much time finding.

          My focus (some might say, "myopia") on metadata is because I believe it is one of the keystone elements that can make better finding of “the right stuff” (people, places, content, services, locations, events, etc.) more possible. However, other critical elements are the interface and human interaction layers and models for finding.

          Today we are almost completely dependent on or limited to the use of both textual models and variations on lists or search results to find what we want. While this is helpful in many situations and is not something I expect will disappear, I am anxious to see additional ways for finding the right stuff (which is not always what we think we want). One way to achieve this goal would be through the use of visual methods and multidimensional techniques.

          Don’t you find that often the most valuable things are those you discover serendipitously? Or how about those situations where you say, “Of course!” but they were not what you had directly been looking for?  Current keyword searching is not unlike trying to find a word in a dictionary that you can’t spell in the first place. It’s extremely limited, because you have to know which term to use in a search, there are no semantic “smarts” to the searching, and of course, it is all purely text based.

          So I’m always looking for and experimenting with alternative or additional ways to do more finding. The good news is that these alternative means seem to be growing, yet they typically don’t get too much attention and awareness of them seems very low. I suspect this is because we are creatures of habit. We are too busy doing things the “old way”, so it is difficult to be aware of better alternatives and to go through the challenge of changing (even though these “old ways” are relatively new habits that we’ve only been doing for a few years at most!). It’s that old “UNLearning” issue, yet again!

          This conundrum seems a bit like the situation, “I could really use some time management training, but I just don’t have the time.” If you suspect that you are spending a lot more time searching than you are finding, consider how much time you will save by acquiring some new finding skills and tools. With this in mind, you may want to check out some of the following ideas. This list of ideas is not meant to be comprehensive; rather, it will serve to show you the range of alternative methods, interfaces, and tools that are available. Hopefully it will also help you find other alternatives.

          Rafe Needleman has a very useful Web site that will help to keep you apprised of the growing field of “webware” or Web-based solutions and technology. I highly recommend that you check out his site regularly and even consider subscribing to it. Recently, Rafe had a post called “Five Weird Ways to Search” that covers a good range of options. I’ve used most of these options, including Grokker and Kartoo, for several years now. I don’t think any of them are “it” or “the next big thing”. However, they are great explorations, and each has something to offer.

          Based solely on my experiences, the two from this article that I recommend you try are Quintura and Grokker. I’ve used Quintura less, but it is a good example of tag clouds*, something that I find very useful and believe you will see much more of in the near future.

          Since our human behaviors are usually the biggest barriers to change, I would STRONGLY encourage you to try these out for at least a few weeks. Do some experiential learning (which seems all too rare these days) of your own. I think you’ll find, at the very least, that it will give you some good glimpses into the future of finding and help to change some of your thinking and approaches to it.  Try out a few of these and send me details of your experiences and of any other find/search tools that you discover so we can all benefit.

          *You can see a limited version of tag clouds, the collection of different-sized words in the upper left column of Off Course – On Target.

          And now for something completely different!

          While you are feeling experimental, try out some of these slightly “farther out” examples of finding the “right stuff”:

          Rafe Needleman’s post Art meets News on Universe shows you examples (such as Universe and the Digg Swarm) of the use of visualization in searching and finding. This technique adds a social dimension to visualization by providing a viewpoint on content from a collection of others. Look past the specific content used in the example (which is silly indeed) and instead see the experience of the interface when video and floating visual choices are used. Then be sure to check out Time Trumpet.

          And, in my continuing quest to wean us from a text-only existence, try out the recent introduction of Nostalgia for photos. It is a wonderful desktop application for Yahoo’s Flickr. And if you like what you see with these, be sure to check out some of the other cool apps from Thirteen23.

          Don’t be overwhelmed by all these new choices. Just give them a quick look and try out two or three to satisfy your curiosity in the search for better finding!

          Wayne
          =====

          March 16, 2007

          Snowflakes Galore at TechFest 2007 (Part 1)

          Do you often find yourself standing or driving somewhere and not knowing either where you are exactly or where what you are looking for is located? Or do you find yourself curious and wanting to know more about the building or structure you are standing in front of?

          I sure do, and so I was buoyed recently by some of the R&D results featured at TechFest 2007, a showcase of over 750 global researchers put on by Microsoft Research in Redmond, WA. One demonstration provided a good example of some themes I’ve mentioned previously here at Off Course – On Target, such as automated metadata generation, finding versus searching, contextual metadata, and personalization. To learn more about TechFest:

          Today I’m focusing on one item from TechFest that caught my attention, the ability to use a cell phone’s camera to trigger a map display and other relevant information about your current location based on the photo you take. This is also a good example of using audio or visual input and output rather than text to convey information. This type of photo feedback enables a more automated and immediate feedback loop to provide you with the information you need.

          To do the initial research and demonstration, the developers acquired millions of street level** pictures of Seattle, which they indexed in a database of distinguishing features. These were then matched up with reference information about that location.


          ** Not interested in being street bound? Then check out Sky Server which lets you walk around the sky the same as you can with Google Earth or Microsoft Virtual Earth used in their super handy Local Live Search!
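The index-then-match idea described above can be sketched very roughly as follows. To be clear about the assumptions: real systems extract numeric descriptors from the images themselves, whereas this toy version uses hand-written text labels as "features", and the two reference locations and their labels are made up purely for illustration. It is not the researchers' actual method.

```python
from collections import Counter, defaultdict


def build_index(reference_photos):
    """Map each feature to the set of locations whose photos contain it."""
    index = defaultdict(set)
    for location, features in reference_photos:
        for feature in features:
            index[feature].add(location)
    return index


def best_match(index, query_features):
    """Return the location sharing the most features with the query, or None."""
    votes = Counter()
    for feature in set(query_features):
        for location in index.get(feature, ()):
            votes[location] += 1
    if not votes:
        return None
    # Highest vote count wins; ties are broken alphabetically.
    return min(votes, key=lambda location: (-votes[location], location))


# Hypothetical reference data: (location, features seen in its photos).
reference_photos = [
    ("Pike Place Market", {"neon sign", "clock", "awning"}),
    ("Space Needle", {"tower", "saucer", "spire"}),
]
index = build_index(reference_photos)
where_am_i = best_match(index, ["clock", "neon sign", "parked car"])
```

Once a location is identified, it becomes the key for looking up the reference information (a map, nearby services, and so on) that gets sent back to the phone.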


          It will be a challenge to scale this for a large number of cities and other locations. However, we are also seeing some other very scalable phenomena that may very well make this all quite possible, saleable and I'd say probable. Consider the huge and apparently sustainable amount of photos and videos being uploaded to marquee examples such as Flickr and YouTube. These could easily provide the volume of photos needed for this technology.

          As more devices (including mobile phones) add GPS capabilities, precise location information will become available. See this recent review of some new smart phones with GPS. But we will still want a richer collection of information, triggered by the GPS data, that would prove relevant to our situation and location. This might include photo images of what surrounds any given location, who else is in the proximity at that time, who we may want to meet there, and resources and services in the vicinity, such as restaurants, shops, Wi-Fi hotspots, parking spaces, and hotel rooms. TechFest offered a number of possibilities.

          There’s lots more to report about TechFest 2007 and I'll talk more about it next time.

          w
          a
          yne
          =====

          March 06, 2007

          Music, Metadata and the "Onomies"

          Regular readers and listeners know that I see metadata as an integral component of the future visions for learning, performance, and probably most other things. They also know that I worry that we often suffer from a version of that old story/joke about the man looking for his lost car keys:

          You are out for a walk one night and happen upon this poor fellow who is down on his hands and knees on the sidewalk frantically searching for his car keys. You stop to lend a hand. After you've searched for several minutes without finding them, you ask him to recall where he last had them and he says he thinks he dropped them as he was locking his car. To which you say, "So this is your car here?" and he says, "No, my car is down two blocks and around the corner." Puzzled, you ask, "Why are you looking here then?" and he says ........................................... "Oh, because this is where the light is better!"

          Sounds silly, but I wonder how easily and often we mimic this behavior of "looking where the light is better"? Seems to me that we often look in the wrong places for the wrong things, or don't shine a light in the right places. Specifically, I find that we often overlook some extremely valuable ideas, technologies, and solutions simply because they are not developed or applied for our specific domain or interests.

          For example, the world of music is one of my favorite and richest sources of innovative examples of personalization. It absolutely reverberates with a plethora of great lessons. I'll try to cover more in future postings, but today I want to bring to your attention a recent blog posting from my Belgian buddy Erik Duval called "From folksonomies to taxonomies".

          Erik and I are mutually fascinated by music in general and the lessons it has for us. Erik references a recent posting in Duke Listens!, a blog by Paul Lamere, which has some very good examples of using the automated metadata generated by Last.fm. Although I have some concerns with his specific example (which involves the use of music genres, such as blues, rock, heavy metal), he uses it to make some excellent points about how you can take the metadata (tags), usually referred to as "folksonomies" or "metadata for the masses", generated by a service such as Last.fm and use these to generate taxonomies.
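
          To give a feel for how tags might be turned into a taxonomy, here is a minimal Python sketch using a simple co-occurrence heuristic: one tag is treated as a likely parent of another if it is applied to more items overall and also covers a large enough share of the items carrying the narrower tag. This is my own illustrative assumption, not the approach described in the postings above; the artists, the tag data, the function name infer_parents, and the 0.5 threshold are all hypothetical.

```python
from collections import defaultdict

# Hypothetical listener tags: artist -> set of tags applied to that artist.
TAGGED_ARTISTS = {
    "artist_a": {"rock", "heavy metal"},
    "artist_b": {"rock", "heavy metal"},
    "artist_c": {"rock", "blues rock"},
    "artist_d": {"rock"},
    "artist_e": {"blues", "blues rock"},
    "artist_f": {"blues"},
    "artist_g": {"blues"},
}


def items_by_tag(tagged_items):
    """Invert the data: tag -> set of items carrying that tag."""
    by_tag = defaultdict(set)
    for item, tags in tagged_items.items():
        for tag in tags:
            by_tag[tag].add(item)
    return by_tag


def infer_parents(tagged_items, threshold=0.5):
    """Guess broader tags for each tag.

    A tag counts as a parent of a child tag when it is used on strictly
    more items and covers at least `threshold` of the child's items.
    Returns {child: [parents in alphabetical order]}.
    """
    by_tag = items_by_tag(tagged_items)
    parents = {}
    for child in sorted(by_tag):
        child_items = by_tag[child]
        for parent in sorted(by_tag):
            parent_items = by_tag[parent]
            if parent == child or len(parent_items) <= len(child_items):
                continue
            coverage = len(child_items & parent_items) / len(child_items)
            if coverage >= threshold:
                parents.setdefault(child, []).append(parent)
    return parents


taxonomy = infer_parents(TAGGED_ARTISTS)
for child, broader in taxonomy.items():
    print(child, "->", broader)
```

          With this toy data, "heavy metal" ends up under "rock", while "blues rock" ends up under both "blues" and "rock", which hints at why genre examples can get messy in practice.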

          Duke Listens! is also a fun site for several other topics that I find particularly interesting, including some of the amazing things that the middle school kids that Duke coaches build with Lego blocks, and his writings on speech recognition, tagging and music.

          I'll cover some additional lessons from the world of music in subsequent postings. Meanwhile, check out these examples of automated generation of metadata and taxonomies. I think you will quickly imagine ways you can apply this technology to your domain. When you do, please share these ideas here so we can all benefit from your insights, and hopefully trigger even more.

          w
          a
          yne
          ======

          December 15, 2006

          More on Metadata

          While catching up with my online reading after a very hectic and productive week in Berlin, I was delighted to find that my posting in October about a keynote I presented on the Future of Metadata and Learning Objects at the International Conference on Digital Archive Technology (ICDAT) in Taipei had stimulated a series of comments in other blogs. I was particularly interested in comments by Scott Wilson and Andy Powell who are very well versed in metadata, and by Stephen Downes, a prolific blogger and presenter on related topics.

          I always find it interesting how others interpret what I’ve written or said. Their comments serve to remind me that posting slides from one of my talks without the accompanying audio can make it difficult for the reader to know what I intended. When I posted the slides, I tried to fill in the missing audio using supplemental text. Stephen, perhaps wisely, often posts his presentations by capturing the audio portion, and then offering it as an MP3 file for downloading.

          After reading my postings again several times, I’m still puzzled as to why the slides and accompanying text were sometimes misinterpreted, but since each of their postings made several good observations and since more discussion about the important topic of metadata is much appreciated, I encourage you to read them.

          To further the discussion, over the next week or so, I plan to expand on some of these comments. Thanks again to Scott, Stephen and Andy for taking the time to read and comment on my previous postings.  I hope this will stimulate even more discussion by others.

          w
          a
          yne
          =====

          November 22, 2006

          Virtualization and UNLearning

          One of our biggest challenges going forward is the need for all of us to “UNLearn” our lifelong habit of putting things in the right “categories” or places. We were often told while growing up, "A place for everything and everything in its place." We’ve consciously or unconsciously transferred this metaphor and model to our digital life in the form of files with well-thought-out names that are filed in well-named directories (folders) and subfolders. We have done this for everything from e-mails to documents to photos.

          This works well enough for a while, but at some point, the volume exceeds the model. Do you have problems finding the e-mail, document, or photo you need? Do you have problems remembering what you called the file or directory that made SO much sense six months ago? Most of us do, and the solution is to STOP trying to solve the problem through well-named files and directories and take a metadata or “tag”-based approach instead.
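
          Here is a minimal Python sketch of what a tag-based approach could look like in contrast to folders. It is an illustration only: the Item class, the find function, and the sample items and tags are hypothetical, and a real system would generate most of the tags automatically rather than by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Any stored thing (e-mail, document, photo) described by tags, not a path."""
    name: str
    tags: set = field(default_factory=set)


def find(items, *wanted):
    """Return the names of items that carry every requested tag (case-insensitive)."""
    wanted_tags = {tag.lower() for tag in wanted}
    return [
        item.name
        for item in items
        if wanted_tags <= {tag.lower() for tag in item.tags}
    ]


# Hypothetical sample collection. Note that the same item can be reached
# through any combination of its tags; there is no single "right place".
collection = [
    Item("keynote-slides", {"metadata", "Taipei", "presentation", "2006"}),
    Item("berlin-photo", {"Berlin", "photo", "2006"}),
    Item("budget-email", {"email", "budget", "2006"}),
]

print(find(collection, "2006", "photo"))
```

          The same item can be found by year, by place, or by type, whichever you happen to remember six months later.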

          With this in mind, I thought you would enjoy checking out a recent video that ZDNet (Ziff Davis Network) posted as part of its At the Whiteboard series (see my next post for more on these whiteboard sessions). This session by Jack Norris, EMC's director of virtualization marketing, explained how file virtualization allows storage administrators to do more with less. However, even if you don’t have “storage administrator” in your job description, aren’t we ALL desperately trying to manage our storage more efficiently? In the few minutes it takes to view this worthwhile video, you’ll learn more about the suggestions I outlined in the podcast on how to change your thinking about files and how to store them.

          Jack actually has two sessions, and I’d recommend that you watch the one on File Virtualization first and then, if you like that, learn a bit more about Global File Virtualization. Jack does a very good job of outlining why we are all facing the challenge of how to scale our solutions as we start to have stores of millions (and soon billions) of files.

          Our challenge, though, is not so much about how to manage the information, since this will increasingly be done automatically. Rather, it is about how to UNLearn what we have spent a lifetime learning about how to put everything in the “right place” and instead adopt a model of assigning all the characteristics and attributes about everything. Remember, the point is NOT searching, it's FINDING!! Moreover, to do that we all need to do a LOT of UNLearning and relearning.

          w
          a
          yne

          =====