Sunday 11 January 2015

The Myth of Google's 200 Ranking Factors

- Posted by to Advanced SEO, Basic SEO and Search Engines
The author's posts are entirely his or her own (excluding the unlikely event of hypnosis) and may not always reflect the views of Moz.
The woman in the gif below just said to Captain Picard that she can show him the definitive and complete list of the 200 Google ranking factors.
Picard, who is a wise man, can do nothing but walk away with a facepalm.
Who can blame Captain Picard for his reaction? We all know, in fact, that a complete and ultimate list of the 200 ranking factors does not exist.
If you agree, then why do we still see statistics like these below on Buzzsumo?
Let me offer this disclaimer before I continue:
I am not writing this post to attack people like Brian Dean, who, in August, published an update to the "complete list" that Backlinko first presented in 2013. Brian, whom I esteem, created an effective piece of link bait (as the 318 linking root domains it earned testify).
I am writing this post because those lists are, quite simply, useless and dangerous, and because I hope to help people—especially the newer generations of SEO—understand that a definitive and complete "List" of Google's ranking factors does not exist. Moreover, some of the factors that appear in those lists:
  1. Are myths;
  2. Are correlation factors and not causal factors;
  3. Are presented just to reach the number of 200.

The origin of the myth

I admit that I did not know how the myth of "200 Google Ranking Factors" originated until a good SEO pal of mine, Giorgio Taverniti, revealed it to me.
The first time Google declared it was using 200 ranking factors was at its Press Day on May 10th, 2006 (you may also want to read the live blog Matt Cutts did, as it illuminates many things that happened thereafter).

Seeing that the correct phrasing was "over 200 ranking factors," we can say that "200" was an approximate number, perhaps offered to journalists in order to explain how complex Google's algorithm is. If the audience had been composed of information technologists, Alan Eustace would probably have used different wording.
Further proof of how silly it is to claim to have discovered "the 200 Google ranking factors" is that, in 2010, Matt Cutts himself declared that, yes, Google counts on over 200 ranking factors, but that each factor may have up to 50 variations:

Meaning is important

Are you sure you really know what "ranking" and "indexing" mean?
I ask you this because I know many SEOs who use both words as synonyms, when they are two completely different concepts and stages of how a search engine works.
Indexing is one of the four interconnected and interdependent phases of how a search engine works:
  1. Crawling
  2. Parsing
  3. Indexing
  4. Search
Indexing is the process of locating and mapping resources around the web that are associated with a word or phrase. It is something the search engines do, not SEOs, even if SEOs can help that work by optimizing a site.
The index, as was so effectively explained to me by Enrico Altavilla, is used to determine what resources to suggest as an answer to a query and the words/phrases composing it, not in what order to suggest them. That is the function of the ranking phase.
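To make the distinction concrete, here is a minimal sketch of a toy inverted index. It is an illustration under my own assumptions, not a description of how Google stores its index: the function names `build_index` and `retrieve` and the sample documents are all hypothetical. Note that the lookup returns an unordered set of candidates; nothing in it decides which result comes first.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def retrieve(index, query):
    """Return the ids of documents containing every query word (unordered)."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents, for illustration only.
documents = {
    "a": "captain picard facepalm",
    "b": "ranking factors myth",
    "c": "google ranking factors list",
}
index = build_index(documents)
candidates = retrieve(index, "ranking factors")  # a set, with no ordering
```

Deciding in what order to present the candidates is a separate step, which is exactly the point of the paragraph above.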
Ranking is the final moment of the fourth phase: Search.
Context plays a major role in the Search phase, and almost every step takes into account the user's and device's characteristics.
As we can see from the image above, the Search phase is composed of four distinct stages:
  1. Understanding the input given by the user with a query. Hummingbird very likely operates at this moment, because Google, in order to better understand the input, modifies or extends the query and only then moves to the second stage.
  2. Retrieving documents from the Index, taking into account commands like "noindex."
  3. Filtering & clustering. Once Google has understood the input and retrieved the corresponding documents from the Index, it applies filters like Panda and other spam filters, but also less considered ones such as the Safe-Search filter and the often forgotten Private Search layer (personalization).
  4. Ranking. Google applies in this moment the X number of ranking factors, not before. And the ranking factors should be considered and counted for every kind of index Google has:
    • Universal search
    • Image search
    • Local search
    • etc.
We should not forget, then, that content and layout composing the SERPs depend a lot on things like the device used.
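The four stages of the Search phase described above can be sketched as a simple pipeline. This is a toy model under stated assumptions: the function names, the `spam` flag, the `score` field and the sample URLs are all hypothetical, and real query understanding, filtering and scoring are vastly more complex. The only thing the sketch is meant to show is the order of operations: ranking is applied last, to whatever survived the earlier stages.

```python
def understand(query):
    """Stage 1: normalise the query (real systems also rewrite or expand it)."""
    return query.lower().split()


def retrieve(terms, index, noindex):
    """Stage 2: fetch documents matching every term, skipping noindex pages."""
    matches = [doc for doc in index if all(t in doc["text"].split() for t in terms)]
    return [doc for doc in matches if doc["url"] not in noindex]


def filter_docs(docs):
    """Stage 3: drop documents flagged by a (hypothetical) spam filter."""
    return [doc for doc in docs if not doc["spam"]]


def rank(docs):
    """Stage 4: only now is an ordering applied, here by a toy score."""
    return sorted(docs, key=lambda doc: doc["score"], reverse=True)


def search(query, index, noindex=frozenset()):
    """Run the four stages in order and return the ordered URLs."""
    terms = understand(query)
    return [doc["url"] for doc in rank(filter_docs(retrieve(terms, index, noindex)))]


# Hypothetical sample data, for illustration only.
index = [
    {"url": "/a", "text": "google ranking factors", "spam": False, "score": 0.4},
    {"url": "/b", "text": "ranking factors myth", "spam": False, "score": 0.9},
    {"url": "/c", "text": "ranking factors spam", "spam": True, "score": 1.0},
    {"url": "/d", "text": "ranking factors draft", "spam": False, "score": 0.7},
]
results = search("Ranking Factors", index, noindex={"/d"})
```

In this toy data, `/c` has the highest score but never reaches the ranking stage because the filter removes it first, and `/d` is excluded at retrieval. Neither outcome has anything to do with a ranking factor.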

The Unbearable Lightness of SEOs

SEOs are talented professionals with a natural tendency to develop a manic-depressive psyche.
Ok, I have exaggerated a little bit, but—and I am an SEO, too—we live moments of pure joy when we see that our work is making the organic traffic of a site rise up and to the right, but also sudden dark periods of (unconscious?) anxiety when Google announces an update or we see a small traffic drop.
For that reason, we love ranking factor lists.
We need them not just as a potential source of information, but because they reassure us, too.
And we love them even if they are just a sequence of myths.
Let's take, for example, "Google's 200 Ranking Factors," published by Backlinko, which I use for no other reason than its being the most recent successful list published.
I'll start with an easy one:

1 - Keyword Density [Ranking Factor 17]

My eyes bleed reading that "although not as important as it once was, keyword density is still something Google uses to determine the topic of a webpage."
Keyword density never was a Google ranking factor. Never.
If we really want to find keyword density as a factor for ranking, we must go back to the 70s and 80s and look at what Stephen E. Robertson, Karen Spärck Jones, and others described as the Okapi BM25 formula.
If keyword density ever had some relevancy as a ranking factor, it was in the Pleistocene era of search engines.
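For reference, here is a minimal sketch of the per-term score in Okapi BM25 as it is commonly stated. Several things here are assumptions on my part: the parameter values `k1=1.2` and `b=0.75` are conventional defaults, the IDF uses a widespread variant that adds 1 inside the logarithm to keep it positive, and the sample numbers are invented. The formula works from term frequency normalised by document length, which is the closest thing in that family to "density," and even there the effect saturates quickly.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Score one query term for one document with the Okapi BM25 formula."""
    # Inverse document frequency: terms found in fewer documents weigh more.
    idf = math.log((n_docs - doc_freq + 0.5) / (doc_freq + 0.5) + 1.0)
    # Length normalisation: longer-than-average documents are penalised.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates: each extra occurrence adds less than the last.
    return idf * tf * (k1 + 1.0) / (tf + norm)


# Invented numbers: an average-length document in a 1,000-document collection.
scores = [
    bm25_term_score(tf, doc_len=100, avg_doc_len=100, n_docs=1000, doc_freq=50)
    for tf in (1, 2, 4, 8)
]
# Score gained at each doubling of the term frequency.
gains = [later - earlier for earlier, later in zip(scores, scores[1:])]
```

Each doubling of the term frequency adds less to the score than the previous one, so repeating a keyword stops paying off almost immediately, even in a formula of this vintage.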
We live in 2014 and Google just had its 16th birthday.
It is obviously still important to have the keyword we want to rank for in the text of a web document.
However, we also know that it is possible to make our site rank for that keyword without having it on the page at all, if Google finds enough consistent and relevant external signals that associate that keyword with our site.

2 - LSI [Ranking Factors 18/19]

For this example I will cite what Bill Slawski wrote in this Inbound.org thread:
Latent Semantic indexing was invented and patented in 1990, before there was a web. 
It was developed to help index small (less than 10,000 documents) databases of documents that didn't change much (like the Web does). 
There have been a number of companies that started selling LSI Keyword generation tools that promised that they could help identify synonyms and words with the same or similar meaning. 
Where those fail is that the LSI process requires access to the database (of documents) in question to calculate which words are synonyms - and the only people with access to Google's database to do that kind of analysis (which isn't possible anyway since Google's index is much too big and changes much too frequently) is Google.

3 - YouTube [Ranking Factor 76]

"There's no doubt that YouTube videos are given preferential treatment in the SERP."
How can this be a ranking factor? At most, it is a monopolistic use Google makes of its own search engine, but a ranking factor?
This is a classic example of how these lists tend to be anything but scientific, and hence unreliable, if not dangerous.

4 - Site Uptime [Ranking Factor 69]

What Brian says is correct: if Google, despite several attempts, sees that a site returns a 500 server response, then that site will start being pushed out of the SERPs.
Correct, but in this case we are talking about an Indexing issue caused by a Crawling problem, not a Ranking one. As I wrote before, meaning is important.

5 - Keyword as first word in domain name [Ranking Factor 3]

The ranking factor list includes this factor because in 2011 a panel of SEOs (myself included) judged that EMDs and PMDs clearly had an advantage in terms of rankings, and said so in the Moz Search Ranking Factors Survey.
In 2013 Moz published a new edition of that survey, and the opinions the same SEOs had were quite different.
The most important thing, though, is understanding that these were just opinions from SEOs; they should be considered (with all the disclaimers) possible, but based more on personal experiences.
Opinions, however authoritative, are just opinions and not science, let alone ranking factors.

6 - Country TLD Extension [Ranking Factor 10]

It is true that cTLDs offer a stronger geo-targeting indication to Google than geo-targeted subfolders and subdomains. 
However, as any international SEO can confirm, a web site with a cTLD domain termination does not necessarily rank better than a generic domain name.
What is not so true, then, is that an .es or .it web site cannot rank well outside of Google.es or Google.it. In this post I wrote last spring on State of Digital, I presented many examples where sites with "Latin American" cTLDs were outranking .es ones in Google.es. In the comments to that post, you can see that this is something common in every regional version of Google.
This "ranking factor" is a clear example of how these kinds of lists may mix correct information with dangerous ignorance. (I am using "ignorance" in its real meaning as "lack of knowledge or information on a given subject," in this case international SEO, and not in its pejorative sense.)

7 - Use of Google Analytics and Google Webmaster Tools [Ranking Factor 78]

How can something described this way be a ranking factor?
"Some think that having these two programs installed on your site can improve your page's indexing. They may also directly influence rank by giving Google more data to work with..."
"Some think?" Who? The university student ranting in a forum? An information technologist? An insider in Mountain View? This is pure speculation.

8 - Guest Posts [Ranking Factor 91]

When we talk about how dangerous doing some kinds of guest posting can be, we are talking about web spam.
Therefore, if a link (or a series of links) from guest posts is considered manipulative in nature, we should talk about "Spam Filters" (3rd Stage of Search) and not actual ranking.
Again, meaning is important.

9 - Facebook Likes and Facebook Shares [Ranking Factor 157/158]

Google cannot see Facebook likes and shares. So they cannot be a ranking factor. Period.
Matt Cutts, in the same SMX panel the list cites as its source, said:
We like standards that are available on the open web. If we're not able to crawl something – like Facebook or like the time we temporarily ran into problems with Twitter – we don't want to depend on that data.
The biggest mistake here, though, is confusing causation with correlation: the power of Social Signals is a correlational one.
As I wrote a week ago in a comment to the Marcus Tober post here on Moz, social shares are not a direct cause of good rankings, but they may help in obtaining them:
Social shares > higher visibility > creation of 2nd tier backlinks (e.g. on Topsy) and improved opportunities of earning natural backlinks from people who discovered that shared content.

10 - Employees listed in LinkedIn [Ranking Factor 171]

Here, we are at the limits of the absurd.
Backlinko defines this as a branding signal. The problem is that a branding signal is not a ranking signal.
It cites an old post (a very good one) that Rand Fishkin wrote back in 2011. Unfortunately, that post was saying something completely different. Rand set out his (correct) hypothesis that, in the future, Google would start looking at "branding" signals in order to create named entities able to reflect the offline relevancy of an online presence.
In that post, Rand never cited the "Employees listed in LinkedIn" as a factor.

I could continue, but it is not my intention to write a full rebuttal post.
No, my intention is to make clear—especially to you, young SEOs—that nothing good can come of your taking these lists at their word.
My intention is to exhort people not to create them. 
What could seem like a good link-bait idea (and the performance of Brian's post is proof that it can be) ends up spreading a fallacious vision of SEO, which will reach the eyes and minds of a mainstream audience of non-SEOs: business owners and marketing executives, who will see the list republished on sites like Hubspot or Entrepreneur.

Are all Google ranking factor lists bad?

No.
We can find serious studies that aim to understand why certain sites rank better than others. The Moz Search Ranking Factors Survey cited before and the Searchmetrics Ranking Factors study are the best examples of that.
Nevertheless, there exists a huge difference between those studies and a simple infographic/post listing the supposed 200 ranking factors: they are correlation studies executed following a solid scientific method.
Be aware that they are correlation studies; hence, they are just telling us what common characteristics the sites that are ranking high in the SERPs have. 
Use them as inspiration for best practices to follow if they really are applicable to your site, nothing else.
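To show what such a correlation study measures, here is a minimal sketch of a rank correlation (Spearman's coefficient) between SERP position and one site characteristic. The choice of Spearman's coefficient, the function names and the numbers are my own assumptions for illustration; the data is made up, and this simple form assumes there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(xs, ys):
    """Spearman rank correlation for two equal-length sequences without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Made-up data: five results for one query.
positions = [1, 2, 3, 4, 5]              # SERP position, 1 is best
linking_domains = [90, 120, 40, 30, 10]  # hypothetical feature values
rho = spearman(positions, linking_domains)
```

With this data the coefficient is strongly negative, meaning that pages with more linking domains tend to sit in better (lower-numbered) positions. It says nothing about whether the links caused the positions, which is why these studies describe common characteristics and not ranking factors.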
You can even try to create a ranking list without doing a correlation analysis, but that work should meet three criteria:
  1. It should be at least as good as the Periodic Table of SEO Success that Search Engine Land presents on its site;
  2. It should be based on deep knowledge of how search engines work; and
  3. It should always present a disclaimer about its subjective nature.
Finally, instead of searching for lists, the best idea I can offer you is to experiment yourself. Create a site, test theories, and try to break the rules in order to understand how Google may be working.
And if you feel you cannot do that alone, then consider joining the IMEC Lab that Rand created a few months ago.
Happy testing!
