Google Sitemaps

Boris Mann
2006-07-24

Google's challenge: searching the live web

created on Sunday, 2006-07-23 22:29

I'm looking forward to the Google "Behind the scenes" presentation put on by the Vancouver HPC Users Group (note: not a permalink; added to Upcoming.org). It's being given by Narayanan Shivakumar ('Shiva'), a Google Distinguished Entrepreneur and the founding Director of the Google Seattle-Kirkland R&D center. The abstract is as follows:

Google deals with large amounts of data and millions of users. We'll take a behind-the-scenes look at some of the distributed systems and computing platform that power Google's various products, and make the products scalable and reliable.

The bio says that Shiva is currently "excited about a variety of search and webcrawling technologies (including Google Sitemaps)".

I see the challenge for Google and all search engines as "how to search the live web". One thing I often explain is that I firmly believe all static web pages will eventually be replaced by dynamic web pages. Put another way, much of the content on the web, especially the content that is updated often, is actually being created by web apps.

For web apps, URLs are nothing more than keys to content. Type in www.domain.com/about, and the underlying web application will look up the content that is keyed to that URL. In fact, that "about" string is nothing more than a query to the underlying content "engine" of a website.
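To make the "URL as key" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `CONTENT` dictionary stands in for whatever database a real web app would use, and `lookup` is a hypothetical helper, not any particular framework's API.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical content store: the URL path is the key, the page body is the value.
CONTENT = {
    "/about": "About this site",
    "/contact": "How to reach us",
}


def lookup(url: str) -> Optional[str]:
    """Treat the path portion of a URL as a query against the content store."""
    # urlparse only separates host from path when a scheme is present,
    # so add one for bare addresses like "www.domain.com/about".
    if "://" not in url:
        url = "http://" + url
    path = urlparse(url).path
    # Normalize a trailing slash so "/about/" and "/about" hit the same key.
    key = path.rstrip("/") or "/"
    return CONTENT.get(key)


if __name__ == "__main__":
    print(lookup("www.domain.com/about"))
    print(lookup("www.domain.com/missing"))
```

A path with no entry simply returns `None`, which is roughly where a real app would answer with a 404.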

What are Google and other search engines? They are centralized aggregators of all the unique queries of all the web apps that run websites in the world. Increasingly, they are having trouble keeping up.
