Helping Googlebot To Help Your SEO

Hi. Today we’re going to be talking about helping Googlebot to help improve your SEO results. You may think that you want Google to come and crawl and index every page of your website. However, there’s a thing called crawl budget, which is the finite amount of resources Googlebot has to crawl and index your website each time it visits. Crawl budget varies from site to site, and it’s based on your site’s strength. There is some research out on the Web that tries to identify the factors that affect the crawl budget assigned to each website.

An example of this may be the quality of your backlinks. If Google sees that your site has good quality backlinks, it may assign a bigger crawl budget to your site than to a site that doesn’t, because it’s going to try to index more of your content when it thinks that content is going to be more useful to the people searching for it.
Why should you control access? Basically, you’re giving the important pages of your site priority when it comes to Googlebot crawling them. Examples of these may be your product pages, service pages, blog posts, and even your contact details page, because people are going to be looking for that in Google.

You’re going to want to ignore or block pages that don’t need to be ranked, such as your privacy policy, terms and conditions, and perhaps blog category pages, because they’re just a means of getting to the blog posts. There’s no really useful content on them that needs to be ranked in Google.

In short, we’re making the most of the crawl budget by telling Google which pages we really want crawled every time and which ones it can ignore.
So how can we do that? The first thing to look at is the robots.txt file. You should make sure you have a robots.txt file in the root of your website. This is a simple text file that tells Googlebot which areas of your site to crawl and which to leave alone. That may be individual pages, whole folders, or even file types. So if there are PDF files on your website that don’t need to be indexed, you can block them in the robots.txt file.
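As a rough illustration, a robots.txt file might look something like this. The folder names and sitemap URL here are just placeholders for whatever makes sense on your own site:

User-agent: *
Disallow: /cart/
Disallow: /search/
Disallow: /*.pdf$

Sitemap: https://www.example.com/sitemap.xml

In this sketch everything is crawlable except the /cart/ and /search/ folders and any URL ending in .pdf, and the Sitemap line simply points Googlebot at your XML sitemap.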
When you are working with this, though, make sure you run it through Google Search Console’s robots.txt testing tool, just to check you haven’t got any rules in there that could accidentally block Googlebot from your whole website. So be careful with that one.
If Googlebot comes to your website from an external link, for example it lands straight on a page that somebody has linked to, it may not take the robots.txt rules into account. In that case you’ll probably want a back-up for the pages that aren’t worth ranking: add noindex tags to the head code of those specific pages. This tells Google that the page shouldn’t be in the index. Of course, by then it has already come to the page and tried to crawl and index it, but you’re specifying that you don’t want it in the index, and next time it probably won’t attempt to come back to that page.
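For reference, a noindex directive is just a meta tag placed in the <head> section of the page you want kept out of the index, something like this:

<head>
  <meta name="robots" content="noindex, follow">
</head>

The “follow” part means Googlebot can still follow any links on the page even though the page itself stays out of the index.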
Again, in Google Search Console there is a testing tool you can use. If you go to Fetch as Google, you can see whether that noindex tag is actually working.
URL parameter rules are something you can set in Google Search Console as well, and you can only do it in there. Basically, it’s a really powerful way of telling Google about dynamically generated URLs that may be duplicates of the normal URLs on your website. If you’ve got a CMS or an ecommerce system running on your domain, chances are it’s generating these dynamic URLs that just reorder, sort, and narrow content.
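For example, an ecommerce category page (these URLs are made up purely for illustration) might be reachable at all of the following addresses even though the content is essentially the same:

https://www.example.com/shoes/
https://www.example.com/shoes/?sort=price-asc
https://www.example.com/shoes/?sort=price-asc&view=list
https://www.example.com/shoes/?colour=black&page=2

Telling Google how the sort, view, colour, and page parameters behave means it doesn’t have to crawl every combination separately.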
In this tool, you can tell Google which URLs are duplicates based on what those parameters actually do to the page. That way you can stop a lot of crawl budget being wasted on duplicate pages and keep the focus on the ones that need to be indexed.

Be careful with this tool too, because again there’s the potential to deindex your whole website if you’re not careful with the rules you set for the different parameters. So you need to understand what the parameters do and make sure there are no issues with the rules you choose.
Keep an up to date XML sitemap. Although Google doesn’t treat the sitemap as a set of rules, and it won’t go and index every page you list just because it’s in there, it does give Google hints about which content it should be indexing. So make sure your new pages are in there, and that old pages that no longer exist are taken out, because if Googlebot follows those links and hits a 404 page, that’s a bit of crawl budget wasted.
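A minimal XML sitemap, again with made-up URLs, looks something like this. Most CMS and ecommerce platforms can generate and update one for you automatically:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
  </url>
  <url>
    <loc>https://www.example.com/services/seo/</loc>
  </url>
</urlset>

The main job is then checking it isn’t still listing pages you’ve removed.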
Moving on to fixing broken internal links: if Googlebot follows links within your website to pages that no longer exist, that’s crawl budget wasted too. Use a tool such as Screaming Frog to crawl your website, identify broken links, and fix them at source.
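If you’d rather script a quick check yourself, a very rough sketch is below. It only checks the links found on a single page, it assumes the requests and beautifulsoup4 packages are installed, and the start URL is a placeholder; a dedicated crawler like Screaming Frog will do a far more thorough job:

import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

START_URL = "https://www.example.com/"  # placeholder: your own homepage

# Fetch the page and collect the internal links on it
html = requests.get(START_URL, timeout=10).text
soup = BeautifulSoup(html, "html.parser")
internal = set()
for a in soup.find_all("a", href=True):
    url = urljoin(START_URL, a["href"]).split("#")[0]
    if urlparse(url).netloc == urlparse(START_URL).netloc:
        internal.add(url)

# Report any link that returns an error status (404s waste crawl budget)
for url in sorted(internal):
    status = requests.get(url, timeout=10).status_code
    if status >= 400:
        print(status, url)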
This plays nicely into site structure as well. Having a good site structure is a really underrated way of helping both the users and the search bots that visit your website to find the pages that matter. If an important page sits three or four levels deep, hidden within the navigation, chances are the crawl budget will be used up before Googlebot gets to it, and users won’t be able to find it either because it’s hidden away.
Plan a good site structure and move pages around. Put your important pages at the top level, or no more than a folder or two down, in the URL structure. Have them in your main navigation, and make sure they’re crawlable through links that are easy to find on your pages.
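As a quick illustration, with made-up URLs again, the first address below is far easier for Googlebot and users to reach than the second:

https://www.example.com/services/seo/
https://www.example.com/resources/2015/archive/old-pages/seo/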
The last one I’ve got here is page load times. As you can imagine, the faster your pages load, the more pages Google can get through within its allocated crawl budget. There are a lot of blog posts and tools out there to help you identify ways to speed up your page load times, and if you apply them across your website you can make a real difference to how many pages are crawled and indexed within that crawl budget.
So there we have it: why I think this is a really important and underrated way of boosting your crawling, indexation, and even your rankings in Google, along with some ideas on how you can control it.

I recommend giving it a go. If you’ve got any questions, contact me on Twitter @Koozai_Dean, or just get in contact with the Koozai sales team, and we’ll be happy to help you. Thanks.
