Welcome back to the Saturday edition of How To Split An Atom, where we ignore the usually boring weekend news and take a speculative tour of the future that I loosely refer to as Web 3.0. For today’s journey, I’m looking at how search will evolve as the Web becomes more semantic.

Web 3.0

For the sake of this conversation, I’m going to use the following definition of Web 3.0.

Definition: Highly specialized information silos, moderated by a cult of personality, validated by the community, and put into context with the inclusion of meta-data through widgets.

Specialized Search Engines

Search Engines

As it stands now, search is usually a hit-or-miss proposition. You begin the journey for any particular piece of information at one of the major content portals. You type in your query, and you are handed results that have been sorted algorithmically. For the most part it works, but the biggest problem that search engines face today is context.

Dedupe

When I search for my name, for instance, I will likely end up with a much more famous “Steven” at the top of the SERP. If I am interested in knowing who is talking about me online, the IMDb page on Steven Spielberg is completely irrelevant. The Web 3.0 solution is one that Google and many others have been toying with for quite some time now: specialized search engines.

Searchlets

The workflow for systems like this is as follows. Before I ever query a term, I first choose my context. It could be something as broad as “authors” or something as narrowly defined as “Gainesville, FL authors”. This context acts as a filter over which my query is run. A prime example of this is Google’s Blog Search. Quite often I am not interested in an eCommerce site about the “iPod”; what I am interested in is the blogosphere’s opinion of the device. By setting my context first, I get a lot more value from my searches.
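To make the "context first, query second" workflow concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the Document class, the context labels, the example URLs and the naive keyword match are all hypothetical, and a real engine would use a proper index rather than a list.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    text: str
    # Hypothetical context labels attached to each page.
    contexts: set[str] = field(default_factory=set)


def search(query: str, context: str, index: list[Document]) -> list[Document]:
    """Choose the context first, then run a naive keyword match inside it."""
    in_context = [doc for doc in index if context in doc.contexts]
    terms = query.lower().split()
    return [
        doc for doc in in_context
        if all(term in doc.text.lower() for term in terms)
    ]


index = [
    Document("shop.example/ipod", "Buy an iPod today", {"ecommerce"}),
    Document("blog.example/ipod-review", "My week with the iPod", {"blogs"}),
]

# Same query, but only the blogosphere's opinion comes back.
blog_hits = search("ipod", "blogs", index)
```

The point of the sketch is the order of operations: the store page matches the keyword just as well as the blog post does, but it never enters the candidate set because the context was fixed before the query ran.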

Web 3.0 will expand upon this idea. Instead of being a huge aggregation of “everything imaginable”, the search engine itself will be nothing more than a portal to smaller “searchlets”. Let’s not confuse this with a directory structure. In directory-based search, you’re forced to wade through often obscure multi-level link trees to find information. It also relies strictly on a human being to sort that information properly. This leads to tiny, often irrelevant datasets.

Tagging

In Web 3.0, search engines will need to have a better understanding of “context”. One way to accomplish this is to take a cue from directories and allow results to be tagged. These tags can be voted on by the community and would only be an addition to, not a replacement for, traditional sorting algorithms. Thus, if an eCommerce site is tagged as a source of information on “iPods”, the community has validated the tag with its votes, and the algorithm agrees that the tag is accurate, the site would appear high on the listing for searches within the context “iPod”.
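One way to picture tags as an addition to the ranking algorithm rather than a replacement for it is a simple additive blend. This is only a sketch under my own assumptions: the smoothing formula, the tag_weight of 0.3, the scores and the example URLs are invented for illustration and are not how any real engine does it.

```python
def tag_confidence(up: int, down: int) -> float:
    """Share of votes agreeing with a tag, smoothed so one vote cannot dominate."""
    return (up + 1) / (up + down + 2)


def context_score(algo_score: float, up: int, down: int,
                  tag_weight: float = 0.3) -> float:
    """Add the community's tag votes on top of the algorithmic score.

    The tag only supplements the traditional ranking; it never replaces it.
    """
    return algo_score + tag_weight * tag_confidence(up, down)


# (url, algorithmic relevance in [0, 1], up-votes, down-votes for the tag "iPod")
candidates = [
    ("shop.example/ipod", 0.70, 40, 2),
    ("spam.example/ipod", 0.72, 1, 30),
]

ranked = sorted(
    candidates,
    key=lambda c: context_score(c[1], c[2], c[3]),
    reverse=True,
)
```

Here the second site scores slightly higher algorithmically, but the community has voted its tag down, so the well-validated store ends up first within the “iPod” context.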

Context

Context is the major driving force behind all Web 3.0 thinking. As the amount of data we are subjected to on a daily basis increases, the only way we will have any chance of using it effectively is if systems are put into place to allow us to refine our context. Everything in the terrestrial world works like this.

When you are looking for a book, you go into a book store or library. If you are looking for a movie, you go to a movie theater or video rental shop. Nowhere in the natural world is there an “everything” store that just contains a hodgepodge of unsorted products. Schools are broken into classes and malls are broken into stores. The point is that in the “real world”, when we ask a question or look for something, we get answers that are relevant to the context we are currently in. In order for search to truly evolve, it must act like this.

Related Projects: Swicki, Google Blog Search, WebMD

Natural Language Search

Language

The second biggest hurdle to search as it stands today is that we can’t really ask search engines questions. The issue has always been that search engines don’t understand context very well.

When people ask each other questions, there is generally enough feedback available for us to understand, with very little trouble, what the other person is “really” asking. If someone who is coughing comes up to you and asks, “What do you know about the common cold?”, chances are good you will recommend a decent cough suppressant. Machines don’t have this luxury. Up until now, the answer has always been either to ignore natural language search or to tell the users of such an engine to be more specific and phrase their questions more precisely. Web 3.0 is a web that understands context, so in it the power of natural language search can be more fully exploited.

Search My Past

If, for example, I have spent a lot of time researching the causes and cures for a cough, and all of my searches have fallen into associated contexts, the engine will be able to understand what I mean when I ask it, “What do you know about the cold?” It will know that I am not asking about the Antarctic; my real concern is the common cold and its cures.

This sort of intelligence will require that we change the way we understand search engines. Search engines will become full web services that we control and can train to understand our behavior. Instead of relying on the moving average of the population’s behavior, as current engines do, such a service will start with that moving average and become more personalized to our needs as we use it.
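The "start with the population, drift toward the individual" idea can be sketched as a weighted blend. Treat this as an illustration only: the sense labels, the prior probabilities and the strength parameter are made-up assumptions, and a real engine would learn far richer signals than a count of past searches.

```python
from collections import Counter


def sense_scores(population: dict[str, float], history: Counter,
                 strength: float = 5.0) -> dict[str, float]:
    """Blend a population-wide prior with one user's search history.

    With an empty history the result equals the population prior; each
    recorded search pulls the scores toward the user's own behavior.
    """
    total = sum(history.values())
    weight = total / (total + strength)  # 0.0 with no history, nears 1.0 over time
    scores = {}
    for sense, prior in population.items():
        personal = history[sense] / total if total else 0.0
        scores[sense] = (1 - weight) * prior + weight * personal
    return scores


# Hypothetical prior: how often "cold" means each thing across all users.
population = {"weather": 0.6, "illness": 0.4}

new_user = sense_scores(population, Counter())
coughing_user = sense_scores(population, Counter({"illness": 12}))

best_for_new_user = max(new_user, key=new_user.get)
best_for_coughing_user = max(coughing_user, key=coughing_user.get)
```

A brand-new user gets the crowd's reading of “cold”, while the user with a dozen health-related searches behind them gets the common cold, which is the behavior described in the cough example.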

Privacy

In order to make this useful, stronger privacy infrastructure will need to be put into place. As likely as not, these search “profiles” would be stored locally instead of being kept on the search engines’ servers. The advantage of this is that the profiles would be portable to other engines and could be loaded or not at the searcher’s discretion. Storing this information locally would also somewhat limit search engines’ ability to use it as demographic data for advertising, unless the end user wished for that to be the case.
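As a rough sketch of what a locally stored, portable profile might look like, the snippet below keeps the profile in a JSON file on the searcher's own machine and attaches it to a request only on demand. The file name, the profile fields and the request format are all hypothetical; no engine exposes such an interface today.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def save_profile(path: Path, profile: dict) -> None:
    """Write the profile to the searcher's own disk, not the engine's servers."""
    path.write_text(json.dumps(profile, indent=2), encoding="utf-8")


def load_profile(path: Path) -> dict:
    """Read a previously saved profile back from disk."""
    return json.loads(path.read_text(encoding="utf-8"))


def build_request(query: str, profile: Optional[dict] = None,
                  share_for_ads: bool = False) -> dict:
    """Attach the profile only when the searcher chooses to load it."""
    request = {"query": query}
    if profile is not None:
        request["contexts"] = profile["contexts"]
        request["allow_ad_targeting"] = share_for_ads  # opt-in, off by default
    return request


profile = {"contexts": {"health": 12, "weather": 1}}

# Round-trip through a file, as if carrying the profile to another engine.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "search_profile.json"
    save_profile(path, profile)
    restored = load_profile(path)

anonymous = build_request("cold")          # profile not loaded
personal = build_request("cold", restored) # profile loaded, ads still off
```

Because the profile is just a file, any engine that understood the format could read it, and leaving it out of the request is as simple as not passing it.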

Digital Body Language

Having a universal search profile would also be useful to “flesh out” our digital persona. What machines lack right now is the digital equivalent of body language. They have no way of understanding us based on their interactions with us. A portable, shareable, locally stored search profile will let us share information with web applications so that we can interact with them in a way more reminiscent of real conversation.

In the identity space, systems like OpenID are doing a tiny subset of this. They are giving us the ability to take our profile data with us. In the Web 3.0 world this will be expanded to include a much larger set of information.

Related Projects: Powerset, Ask.com, Google Search History, OpenID

People Search

A huge part of Web 3.0 search will revolve around “People Search”. As our social networks expand, and more cults of personality make their way into the digital wastelands, we will want ways to find out who is who. My article on Blogging 3.0 is a good place to start to get a better idea of how cults of personality will fit into the Web 3.0 landscape.

What Web 3.0 will allow us to do is not just find websites related to concepts; using natural language, we will be able to find answers to questions from experts who have written about them previously. Think of it as a melding of Digg and Google’s specialized search engines. Say, for example, you wanted to know about the common cold and you found a great blog post on curing it. If you voted for this post and others agreed, then over time, when someone asked that question, that post would rise to the top. More importantly, if someone searched for that author, what would appear is a listing of that person’s “core competencies”: the articles, profiles, images, videos and so on that the Web most closely relates to that person. Since we are dealing in context, the results of this search would only be as good as the context you are in. I, for example, would appear neither in searches around the common cold nor in searches for “movies”.

Expert Systems and Guided Search

Guides

My article on Web 3.0 technologies goes pretty deeply into how software agents and expert systems will be important in Web 3.0, so I won’t touch on them here. The final point I would like to make is where “guided search” systems like Mahalo and ChaCha will fit into the equation.

Guided Search

Guided search engines always belong in the context of their creators. The reason that guided search, in and of itself, is not sufficient is that it ignores the “wisdom of the crowds” by seeing search through an editor’s eyes. Guided search solves the problem of context while ignoring the problems associated with a purely editorial infrastructure.

The future of systems like this is in combining them with more traditional algorithms to produce a search engine that allows you to “fill in the blanks” with the aid of guides. Guides and human-based search are most powerful when the other types of search have failed outright. If, for example, you are looking for some very specific piece of information on an obscure subject, a search engine quite often fails to “understand” what you are trying to accomplish. Editorially powered search, when combined with fast search algorithms, natural language search and a strong database of previously answered questions, could plug this hole.
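The combination described above can be sketched as a fallback chain: try the algorithm, then the archive of answered questions, and only then a human guide. This is a toy illustration under my own assumptions. The function names, the min_score threshold and the stub engine and guide are hypothetical and say nothing about how Mahalo or ChaCha actually work.

```python
from typing import Callable


def hybrid_search(
    query: str,
    algorithmic: Callable[[str], list[tuple[str, float]]],
    answered: dict[str, str],
    ask_guide: Callable[[str], str],
    min_score: float = 0.5,
) -> tuple[str, str]:
    """Return (source, answer), trying the cheapest source first."""
    results = algorithmic(query)
    if results:
        best_url, best_score = max(results, key=lambda r: r[1])
        if best_score >= min_score:
            return "algorithm", best_url
    key = query.strip().lower()
    if key in answered:
        return "archive", answered[key]
    answer = ask_guide(query)  # a human fills in the blank
    answered[key] = answer     # the next searcher gets it from the archive
    return "guide", answer


def weak_engine(query: str) -> list[tuple[str, float]]:
    """Stub engine that never finds anything relevant enough."""
    return [("example.com/vague", 0.1)]


def guide(query: str) -> str:
    """Stub for a human guide's hand-picked answer."""
    return "guide.example/answers/1"


archive: dict[str, str] = {}
first = hybrid_search("obscure thing", weak_engine, archive, guide)
second = hybrid_search("Obscure thing ", weak_engine, archive, guide)
```

The first search falls all the way through to the guide; the second, asked slightly differently, is served from the archive, so the guide's effort is spent only once.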


Related Projects: Mahalo, ChaCha, About.com

Summary

  • Search engines will be replaced by smaller, specialized searchlets.
  • Search engines will be able to understand context through tagging and community interaction.
  • Search “profiles” will become portable, allowing us to have the digital equivalent of body language.
  • Natural language search will be improved once search engines have a stronger understanding of context.
  • People search will become more important.
  • Guided / Editorial search will be a stopgap where search engines still fail to provide relevance.

Web 3.0 Roundup

Here is some required reading material to help you catch up on Web 3.0.

Defining Web 3.0
Enabling Technologies For Web 3.0
Blogging In Web 3.0
Advertising In Web 3.0
Media In Web 3.0
Conversational Advertising
Web 3.0: A Case Study
