I've been hearing about Google's implementation of Latent Semantic Indexing (LSI) for some time now. In a nutshell, the buzz is that LSI will change the On Page SEO game completely. The On Page SEO process currently consists of some of the following evaluations, among others:
Keyword Density
H1 tags
Meta/Title Tags
Tags matching anchor text
Interlinking/Navigation
Well Written Targeted Content
Do we now scrap the keyword density checkers we've been using and start reading articles about IR, heuristics, and matrices, since these are supposedly being used to calculate the overall "theme" of a web site based on the probable relationships between keywords rather than on density?
Here is an excerpt of the patent filing by Google:
"The system is further adapted to identify phrases that are related to each other, based on a phrase's ability to predict the presence of other phrases in a document. More specifically, a prediction measure is used that relates to the co-occurrence rate of two phrases to an expected co-occurrence rate of the two phrases. Info gain, as the ratio of actual co-occurrence rate to expected co-occurrence rate, is one such prediction measure.
Two phrases are then related where the prediction measure exceeds a predetermined threshold. In that case, the second phrase has significant information gain with respect to the first phrase. Semantically, related phrases will be those that are commonly used to discuss or describe a given topic or concept, such as "President of the United States" and "White House." For a given phrase, the related phrases can be ordered according to their relevance or significance based on their respective prediction measures."
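To make the excerpt a bit more concrete, here is a minimal sketch of that prediction measure as I read it. This is not Google's code. The function names, the sample documents, the threshold of 1.5, and the choice to estimate the expected co-occurrence rate as the product of the two individual rates are all my own assumptions, and plain substring matching stands in for real phrase detection.

```python
from itertools import combinations


def information_gain(docs, phrase_a, phrase_b):
    """Ratio of the actual co-occurrence rate to the expected co-occurrence rate."""
    n = len(docs)
    has_a = [phrase_a in d.lower() for d in docs]
    has_b = [phrase_b in d.lower() for d in docs]
    # Actual rate: share of documents that contain both phrases.
    actual = sum(a and b for a, b in zip(has_a, has_b)) / n
    # Expected rate (assumption): what we would see if the phrases were independent.
    expected = (sum(has_a) / n) * (sum(has_b) / n)
    return actual / expected if expected else 0.0


def related_phrases(docs, phrases, threshold=1.5):
    """Keep pairs whose measure exceeds the threshold, strongest first."""
    pairs = []
    for a, b in combinations(phrases, 2):
        gain = information_gain(docs, a, b)
        if gain > threshold:  # hypothetical threshold, not taken from the patent
            pairs.append((a, b, gain))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Made-up sample documents, for illustration only.
docs = [
    "The President of the United States spoke at the White House today.",
    "Tourists lined up outside the White House to see the President of the United States.",
    "The keyword density checker reported five percent.",
    "A keyword density tool cannot measure theme.",
]
phrases = ["president of the united states", "white house", "keyword density"]

print(related_phrases(docs, phrases))
```

On this toy data the only pair that clears the threshold is "president of the united states" and "white house", which matches the example given in the excerpt. Note that nothing here counts how often a keyword appears on a page; the measure only looks at which phrases tend to show up together.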
So basically, anything that helps determine the topics, contexts and themes of a given page, like industry terms, synonyms, buzz words, acronyms, etc., will be more useful than ever and will impact the way your page gets ranked. The relevancy of theme-based words will play a key role this year and beyond. New factors such as LSI and theme-based relevancy have been touted in recent years as the next frontier of ranking pages and combating keyword spam.
If LSI were ever implemented in a non-homogeneous environment, on-page optimization would become as critical as off-page optimization, and the two would have to be implemented so that they complement each other in a way that satisfies both your visitors and the search engines. Anyone interested in achieving top ten rankings in an LSI-driven environment would need to understand the basics of this methodology and how to comply with its requirements. Is it being implemented? Maybe to a degree.
There's an important twist. LSI is a real technology; however, some experts say that using LSI on the non-homogeneous World Wide Web is not practical at all, and that LSI is simply the latest snake oil that some SEOs put in their pitch.
Here is a great link on this topic from which I will borrow a quote:
http://www.seo-blog.com/latent-semantic-index-lsi-myth.php
I believe this article clearly explains why the use of LSI on a massive scale is impractical and why it is not currently being used the way some SEO purveyors claim. So, if you've seen or heard a pitch from an SEO or a software designer who claims they can apply LSA (Latent Semantic Analysis) or LSI to your site to increase your rankings, in my opinion, they are full of shit.
This author cites a defining quote about LSI and its application in evaluating sites on a large scale:
"Professor Michael Berry head of the Department of Computer Science at the University of Tennessee wrote me as follows 'Just for the record, LSI has been used to index on the order of 10 million documents using out-of-core SVD based techniques so you could apply it to subdomains of the Web but the entire Web would be problematic as you point out'”
Saturday, March 28, 2009