Valid (X)HTML code and SEO

I’ve had several discussions about creating websites with (X)HTML that validates against the W3C’s specifications. The topic very often comes up when SEO is discussed, and I’ve often wondered about the motives of people who claim that a site with 100% valid code is better optimized for search engines than a website that does not validate. I’ve never heard a good argument for this. It just seems that some XHTML evangelists like to decorate themselves with a shiny “Valid XHTML” badge.

Quite a lot of testing has been done on how Google treats valid vs. invalid code. Here are two of those tests:

1 – “Google actively ranked the websites with invalid HTML higher than the websites with valid HTML. Google even refused to rank one of the valid HTML websites altogether.”

2 – “Well – the result is clear. From these 4 pages Google managed to pick the page with valid css and valid html as the preffered page to include in it’s index!”

These two tests point in opposite directions. It looks like Google is pretty much unpredictable when it comes to W3C compliance.

Let’s check what Google thinks about valid code on its own website. Holy crap! 51 errors on a website with such a simple layout! It doesn’t look like Google cares much about W3C validation at all!

Philipp Lenssen interviewed Matt Cutts at the Googleplex:

“In more general terms, what do you think is the relationship between Google and the W3C? Do you think it would be important for Google to e.g. be concerned about valid HTML?

I like the W3C a lot; if they didn’t exist, someone would have to invent them. :) People sometimes ask whether Google should boost (or penalize) for valid (or invalid) HTML. There are plenty of clean, perfectly validating sites, but also lots of good information on sloppy, hand-coded pages that don’t validate. Google’s home page doesn’t validate and that’s mostly by design to save precious bytes. Will the world end because Google doesn’t put quotes around color attributes? No, and it makes the page load faster. :) Eric Brewer wrote a page while at Inktomi that claimed 40% of HTML pages had syntax errors. We can’t throw out 40% of the web on the principle that sites should validate; we have to take the web as it is and try to make it useful to searchers, so Google’s index parsing is pretty forgiving.”

(the whole article is here)
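To put the point about quotes and forgiving parsing in concrete terms, here is a rough sketch using Python’s standard-library `html.parser`. The two markup strings are made up for illustration (they are not taken from Google’s actual home page), and `AttrCollector` and `parse` are hypothetical helper names. Under XHTML rules the unquoted version is invalid, yet a tolerant parser reads the same attributes out of both, and the unquoted one is a few bytes smaller. This says nothing about how Google’s own parser works; it only shows that lenient parsing of this kind is easy to do.

```python
from html.parser import HTMLParser

# Hypothetical markup for illustration only (not Google's real home page).
QUOTED = '<body bgcolor="#ffffff" text="#000000"><p>Hello</p></body>'
UNQUOTED = '<body bgcolor=#ffffff text=#000000><p>Hello</p></body>'


class AttrCollector(HTMLParser):
    """Collects (tag, attributes) pairs from every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, dict(attrs)))


def parse(markup):
    """Run the tolerant stdlib parser and return what it extracted."""
    collector = AttrCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


# Bytes saved by dropping the quotes, and whether both versions parse alike.
bytes_saved = len(QUOTED.encode("utf-8")) - len(UNQUOTED.encode("utf-8"))
same_result = parse(QUOTED) == parse(UNQUOTED)
print(bytes_saved, same_result)
```

With these two strings the script prints `4 True`: four quote characters saved, and identical attributes extracted from both versions.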

Please don’t misunderstand me: I believe, and have always believed, that valid code is what we should strive for, but let’s not get nazi about it. I think this quote sums it all up pretty well:

“Creating valid templates should be a path, not a goal. The idea is to make your template as accessible as possible for humans and spiders, not to achieve a badge of valid markup.”
