I’ve had several discussions about creating websites with (X)HTML that validates according to W3C’s specifications. The topic very often comes up when SEO is discussed, and I’ve often wondered about the motives of people who claim that a site with 100% valid code is better optimized for search engines than a site that does not validate. I’ve never heard a good argument for this. It just seems that some XHTML evangelists like to decorate themselves with a shiny “Valid XHTML” badge.
Quite a lot of testing has been done on how Google treats valid vs. invalid code, and here are two of those tests:
These two tests point in two different directions. It looks like Google is pretty much unpredictable when it comes to W3C compliance.
Let’s check what Google thinks about valid code on its own website. Holy crap! 51 errors on a website with such a simple layout! Doesn’t look like Google cares much about W3C validation at all!
Philipp Lenssen interviewed Matt Cutts at the Googleplex:
“In more general terms, what do you think is the relationship between Google and the W3C? Do you think it would be important for Google to e.g. be concerned about valid HTML?
I like the W3C a lot; if they didn’t exist, someone would have to invent them.
People sometimes ask whether Google should boost (or penalize) for valid (or invalid) HTML. There are plenty of clean, perfectly validating sites, but also lots of good information on sloppy, hand-coded pages that don’t validate. Google’s home page doesn’t validate and that’s mostly by design to save precious bytes. Will the world end because Google doesn’t put quotes around color attributes? No, and it makes the page load faster.
Eric Brewer wrote a page while at Inktomi that claimed 40% of HTML pages had syntax errors. We can’t throw out 40% of the web on the principle that sites should validate; we have to take the web as it is and try to make it useful to searchers, so Google’s index parsing is pretty forgiving.”
(the whole article is here)
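To make "forgiving" parsing a bit more concrete, here is a minimal sketch using Python's standard library `html.parser`. This is not Google's parser, and says nothing about how Google actually indexes pages; the `TagCollector` class, the `parse_sloppy` helper and the sample markup are all made up for illustration. It only shows that a tolerant parser can still pull tags, attributes and text out of markup with unquoted color attributes and unclosed paragraphs.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records start tags, their attributes, and text."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        self.tags.append((tag, dict(attrs)))

    def handle_data(self, data):
        # keep only non-empty text chunks
        if data.strip():
            self.text.append(data.strip())


def parse_sloppy(markup):
    """Feed markup to the tolerant parser and return (tags, text)."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()  # flush any text still buffered at end of input
    return collector.tags, collector.text


# Invalid by W3C standards: unquoted attribute values, unclosed <p> tags,
# and no doctype, <html> or <head>.
SLOPPY = "<body bgcolor=#ffffff text=#000000><p>Hello<p>World"

tags, text = parse_sloppy(SLOPPY)
print(tags)
print(text)
```

The parser does not complain about any of the errors; it simply reports what it finds, which is roughly the attitude described in the interview above.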
Please don’t misunderstand me: I do believe, and have always believed, that valid code is what we should strive for, but let’s not get nazi about it. I think this quote sums it all up pretty well:
