5 Common Crawlability Mistakes That Kill Your SEO Success


Small Business Search Marketing

In my opinion, the nuts and bolts of SEO can generally be boiled down to three primary parts: Crawlability, Content, and Links. These three things make up the middle row of the SEO Success Pyramid, and they’re an absolute must as you work your way up the pyramid to becoming a trusted site.

Search engine spiders/bots aren’t all that intelligent. If a spider can’t find content (because of a broken link, for example), it’s not programmed to stop what it’s doing and go looking around for that great article you wrote. It’s going to move on to the next link and keep crawling, crawling, and crawling. That’s what it does.

It’s common sense: If a spider can’t access your content, your content won’t be indexed and will never be found in search engines. That’s why crawlability is a foundational element of SEO and the SEO Success Pyramid.

5 Common Crawlability Barriers

1.) You screwed up the robots.txt file.

If you’re like me, you roll your eyes every time you hear or read someone talking about this, right? I mean, really, who still screws up their robots.txt file? Search marketers and others have been banging this drum so long, you’d think it doesn’t need to be said anymore.

Well, well, well … have a look at what I found last week on Yahoo Answers:

[Screenshot: robots.txt question on Yahoo Answers]

Apparently, we do still need to bang the drum: Be careful with your robots.txt files. It’s the first thing to check when you think you have crawlability issues. You can learn everything you need to know at www.robotstxt.org.
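If you want to sanity-check a robots.txt file before it goes live, here is a minimal sketch using Python's standard urllib.robotparser module. The example.com URLs and both sample robots.txt bodies are made up for illustration; the first one shows the classic mistake of disallowing the whole site when only one directory was meant to be blocked.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="*"):
    """Return the URLs that the given robots.txt rules forbid for user_agent."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt bodies, for illustration only.
ROBOTS_MISTAKE = """\
User-agent: *
Disallow: /
"""

ROBOTS_INTENDED = """\
User-agent: *
Disallow: /admin/
"""

# Hypothetical URLs on a placeholder domain.
SAMPLE_URLS = [
    "http://www.example.com/",
    "http://www.example.com/blog/great-article/",
    "http://www.example.com/admin/login",
]

if __name__ == "__main__":
    print("Blocked by the mistaken file:", blocked_urls(ROBOTS_MISTAKE, SAMPLE_URLS))
    print("Blocked by the intended file:", blocked_urls(ROBOTS_INTENDED, SAMPLE_URLS))
```

Run something like this against the pages you care about most. If your best content shows up in the blocked list, you've found your crawlability problem.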

2.) Too many variables/parameters in your URLs

Search engines are getting better at crawling long, ugly links — but they still don’t like them. Google’s webmaster guidelines explain it in plain English:

If you decide to use dynamic pages (i.e., the URL contains a “?” character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few.

(Bonus: Short URLs also get clicked on more often in the SERPs. They’re good for crawlability and clickability.)
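To get a quick read on which of your URLs are carrying a lot of parameters, a small script along these lines can help. It's only a sketch: the max_params threshold of 2 is my own assumption (the guideline quoted above doesn't give a number), and the example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def query_parameters(url):
    """Return the (name, value) pairs found in the URL's query string."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)


def is_parameter_heavy(url, max_params=2):
    """Flag URLs with more query parameters than max_params.

    max_params is an arbitrary illustrative threshold, not a number
    published by any search engine.
    """
    return len(query_parameters(url)) > max_params


if __name__ == "__main__":
    # Hypothetical URLs on a placeholder domain.
    for url in (
        "http://www.example.com/product.php?cat=12&item=345&sort=asc&ref=home",
        "http://www.example.com/widgets/blue/",
    ):
        print(url, len(query_parameters(url)), is_parameter_heavy(url))
```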

3.) Session IDs in your URLs

Search engine spiders flat-out do not like to see session IDs in your URLs. Session IDs can cause a single page of content to be visible at multiple URLs, and indexing all of those duplicates would just clog up the SERPs. If you’re using session IDs on your site, be sure to store them in cookies (which spiders don’t accept) instead of including them as part of your URLs.
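Here is a rough sketch of the duplicate-URL problem. It strips session parameters from the query string so you can see that two "different" URLs are really the same page. The parameter names in SESSION_PARAMS are assumptions on my part (check what your own platform uses), the URLs are hypothetical, and the sketch only looks at the query string, not at session IDs embedded in the path.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed session parameter names; adjust to match your own site.
SESSION_PARAMS = {"phpsessid", "sid", "sessionid"}


def strip_session_id(url):
    """Return the URL with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    # Two visits to the same hypothetical article, each with its own session ID.
    first = "http://www.example.com/article.php?id=7&PHPSESSID=abc123"
    second = "http://www.example.com/article.php?id=7&PHPSESSID=xyz789"
    print(strip_session_id(first) == strip_session_id(second))
```

To a spider, those two addresses look like two separate pages with identical content, which is exactly the clutter search engines want to avoid.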

4.) Your site suffers from code bloat.

Code bloat is one of those things that isn’t really a problem … until it’s a Big Problem. Spiders are generally good at separating code from content, but that doesn’t mean you should make it more difficult by having so much code that the content is hard to find. If you look at the source code of your web pages, and finding the content is like looking for the proverbial needle in a haystack, you may have crawlability problems. As Stoney deGeyter recently said on Search Engine Guide, “I do believe that if you have so much code on your pages that it makes it hard to dig out the content, then you might have some issues.” I agree.
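If you'd rather measure the haystack than eyeball it, one rough approach is to compare the amount of visible text on a page to the total size of its source. The sketch below does that with Python's standard html.parser module. To be clear, this text-to-code ratio is just a heuristic of my own for spotting bloated pages; I'm not claiming any search engine uses it, and there is no magic number to aim for.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text that sits outside <script> and <style> blocks."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside script/style
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def text_to_code_ratio(html):
    """Return visible-text characters divided by total source characters."""
    if not html:
        return 0.0
    extractor = VisibleTextExtractor()
    extractor.feed(html)
    extractor.close()
    visible_text = " ".join(extractor.chunks)
    return len(visible_text) / len(html)
```

Feed it the source of a few of your pages and compare them to each other. A page that scores far lower than the rest of your site is a good candidate for a cleanup.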

5.) Your navigation and internal linking are coded poorly.

Designers and developers can be pretty creative when building a web site. Sometimes that creativity comes out in the form of site navigation that’s built in complicated DHTML or JavaScript code. Sometimes it comes out in the form of Flash- or Ajax-based navigation, where what we think of as web pages aren’t really web pages at all. This kind of design and implementation can stop a crawler in its tracks. Google talked about crawlability problems with Flash, Ajax, and JavaScript in late 2007:

“One of the main issues with Ajax sites is that while Googlebot is great at following and understanding the structure of HTML links, it can have a difficult time finding its way around sites which use JavaScript for navigation. While we are working to better understand JavaScript, your best bet for creating a site that’s crawlable by Google and other search engines is to provide HTML links to your content.”
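To see why plain HTML links matter, here is a deliberately simplified model of a link-following spider, written with Python's standard html.parser module. It only collects the href values of ordinary anchor tags, which is my assumption about a basic crawler and not a description of how Googlebot actually works. The navigation markup and the loadPage function in it are hypothetical.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect href targets from <a> tags, the way a basic spider might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip links that only work by running JavaScript in a browser.
        if href and href != "#" and not href.lower().startswith("javascript:"):
            self.links.append(href)


def crawlable_links(html):
    """Return the link targets a simple HTML-only crawler could follow."""
    extractor = LinkExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.links


# Hypothetical navigation: one plain HTML link, three script-driven items.
NAV_HTML = """
<a href="/about/">About</a>
<a href="javascript:loadPage('services')">Services</a>
<span onclick="loadPage('contact')">Contact</span>
<a href="#" onclick="loadPage('blog')">Blog</a>
"""

if __name__ == "__main__":
    print(crawlable_links(NAV_HTML))
```

In this sketch, only the About page is reachable. The other three navigation items depend on script execution, so a spider that just follows HTML links never learns those pages exist.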


Crawlability is often overlooked in the name of creativity and coding, but it’s as important to your SEO efforts as content development, link building, and any other element of the SEO Success Pyramid. Ignore it at your own risk.
