Internet Marketing Coach
Internal Linking - META nofollow, rel nofollow, robots.txt Confusion
February 27th, 2008 | 31 comments
Those of you who have been following this blog for a while know that I’m big on developing good internal linking structures. A good internal linking structure can not only increase the number of pages your site/blog has indexed, but it can actually increase the authority certain pages receive.
There are 3 “controllers” when it comes to this: the META noindex tag and its various forms, the robots.txt file, and the rel=”nofollow” attribute within links.
In recent months there has been a lot of confusion as to what each of these controls actually does. Does a robots.txt exclusion restrict Google from indexing a page? Does it prevent the page from receiving “link juice”? Does the page still show up in the SERPs? And what about the META nofollow tag and rel=”nofollow”?
How exactly is each of these treated by Google? In this article we’re going to dive into each of these bot controllers and try to eliminate some of the confusion.
First I’ll explain what each one does and then I’ll go over the best way to control your internal linking.
The robots.txt File
The robots.txt file is a simple text file that you can create with any basic text editor like Notepad, WordPad, etc. It is uploaded into the root of a site and named robots.txt. For a more in-depth explanation, refer to my robots.txt guide.
According to Matt Cutts, in an interview by Eric Enge that was published by Andy Beard, robots.txt will prevent Googlebot from crawling any page that is restricted within it. However, these pages can still obtain PageRank and they can still be returned in the SERPs.
What does that mean? If you restrict a page with robots.txt, it only means that Google won’t read its content or follow the links within the page(s). That tells me that although the page(s) can still receive PageRank, that PR will not be distributed to the links within the page, as it would be if the page weren’t restricted. This is why Google wants those who sell links to use robots.txt to restrict sponsored posts and other pages that contain sold links.
This also tells me that it is NOT a good way to control your own internal linking structure in most cases because if there is even one other link on the WWW linking to that page, it will still accrue PR and it will still rank.
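If you want to double-check which of your URLs a given set of robots.txt rules actually blocks, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, the example.com URLs and the is_crawlable helper are all made up for illustration; they are not from any real site. Keep in mind it only tells you whether a crawler may fetch the page, not whether the page can accrue PageRank or show up in the SERPs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the paths are invented for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /sponsored/
Disallow: /contact.html
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules from text, no network access
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/sponsored/review.html", "/contact.html", "/articles/internal-linking.html"):
        url = "http://www.example.com" + path
        status = "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked"
        print(path, status)
```

Note that this standard-library parser does simple prefix matching, so it is only a rough stand-in for how Google itself reads the file.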
The META noindex tag
There are actually several forms of this META tag, but we’ll just talk about the most important.
- “NOINDEX” same as “NOINDEX, FOLLOW”
This tells Google not to index the page, but it still crawls the page and links are still followed.
- “NOFOLLOW” same as “INDEX, NOFOLLOW”
This tells Google to index this page, but to ignore the outgoing links.
- “NOINDEX, NOFOLLOW”
This tells Google not to index the page or follow the links contained within.
This is a very basic explanation of these tags, excluding whether or not they receive and/or pass PageRank because to be honest with you it’s damn confusing to me too. However, if you follow my advice below, it will not matter.
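To see which of the forms above a page is actually sending, you can read its robots META tag with a few lines of Python. This is only a rough sketch using the standard html.parser module; the RobotsMetaParser class, the robots_policy helper and the sample HTML are hypothetical names chosen for this example. It reports the index and follow settings described in the list above and says nothing about PageRank.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma separated and case-insensitive.
        self.directives.update(d.strip().lower() for d in content.split(",") if d.strip())


def robots_policy(html: str) -> dict:
    """Return the index/follow settings; both default to True when not restricted."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return {
        "index": "noindex" not in parser.directives,
        "follow": "nofollow" not in parser.directives,
    }


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="NOINDEX, FOLLOW"></head></html>'
    print(robots_policy(sample))
```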
The rel=”nofollow” attribute
This attribute, when inserted into a link, tells Google to ignore that link. But if the page the link points to is linked to from another page on the WWW, it will still be indexed, crawled and assigned authority.
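A quick way to audit which links on a page carry the attribute is to split them into followed and nofollowed lists. The sketch below again uses Python's standard html.parser; the LinkAuditor class, the audit_links helper and the sample markup are assumptions made up for this example, not part of any plugin or tool mentioned here.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Sort the hrefs of <a> tags by whether rel contains "nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel can hold several space-separated tokens, e.g. rel="external nofollow".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


def audit_links(html: str):
    """Return (followed, nofollowed) lists of hrefs found in html."""
    auditor = LinkAuditor()
    auditor.feed(html)
    return auditor.followed, auditor.nofollowed


if __name__ == "__main__":
    sample = (
        '<a href="/money-page.html">Best widgets</a> '
        '<a href="/contact.html" rel="nofollow">Contact</a>'
    )
    followed, nofollowed = audit_links(sample)
    print("followed:", followed)
    print("nofollowed:", nofollowed)
```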
The Best Way to Control your Internal Linking
You can do pretty well by simply using the rel=”nofollow” attribute in many cases, but to be absolutely sure and to have 100% control you need to use a combination of robots.txt, META nofollow tags and rel=”nofollow”.
Leaving out any one of those controllers can prevent your site from having the best possible internal linking structure!
Photo credits: bogdan.glushak
27th February, 2008 at 5:12 pm
Good post Josh, I think there is a lot of confusion on the matter of linking, many don’t know what should be nofollow, noindex etc.
27th February, 2008 at 5:55 pm
Thanks John, hopefully this helps clear up the confusion.
27th February, 2008 at 6:03 pm
Did the same kind of analysis myself a while back Josh and I came to the conclusion that the only way to avoid indexing at all was to password protect individual pages or directories. I’ve been stunned quite often by just what the Google spider in particular does pick up - recent search terms, author profiles etc. - when it seems to have greater difficulty with actual articles!
Well put together Josh even if the conclusion has to be “Well, actually…”
27th February, 2008 at 6:43 pm
Hey Chris,
I’ve had pretty good luck restricting pages through a combination of the three. Actually, so far it’s worked every time. When I do it those pages don’t get indexed, assigned PR or crawled in any way.
Password protecting them I’m sure would be a foolproof way as well, but then users can’t access them.
Thanks for the compliment, greatly appreciated!
27th February, 2008 at 8:27 pm
Hi Josh
I’ve got your Article Marketing Domination and have implemented much of what you suggest to great effect.
I must admit the internal linking section went over my head and I ignored it except to change the link names from “home” to the relevant key words.
Unfortunately, even though I have read your post a number of times I still haven’t fully grasped what you are suggesting. I think I am probably making it out to be more difficult than it is.
The 2 areas I am confused about are:
1) What criteria do you use to decide which pages to restrict?
2) I can’t visualise the linking structure. Is there any way to show this in a diagram?
Sorry if I’m being a bit thick but as my grandfather used to say:
“If you ask a question, you might appear stupid for a couple of minutes. If you never ask a question you’ll remain stupid the rest of your life!”
Thanks for your blogs. I have learnt a lot from you.
Paul
27th February, 2008 at 8:37 pm
Hi Paul,
It’s much easier once you understand it. It’s one of those things that just hits you.
1. Which pages will make you money either directly or indirectly? Your contact me page won’t, about me won’t, etc. Which pages are duplicated? For instance, is there more than one URL with the same content? There’s no need to send PR (authority) to both of those pages. Better to isolate it.
2. I don’t have one on hand, but that’s a great idea. I need to figure out how to make one first (extremely technically challenged) but I’ll definitely look into it and post some on the blog when I do.
27th February, 2008 at 10:34 pm
Thanks Josh,
I like your writing - clear and straightforward.
27th February, 2008 at 10:35 pm
Thanks, Evan. I guess it’s just a god-given talent
jk
28th February, 2008 at 12:14 am
Josh,
Good recap on the no follow scenario. I found this out about 8 months ago when I watched a video from Stompernet. Didn’t realize that you could do this, but quickly implemented it on my ecommerce site and saw some great results. After all, who wants to get a PR 3 for a contacts page? LOL!
28th February, 2008 at 12:56 am
Josh, great post. Based on what you are saying, deep linking back into your own site should always be “rel=nofollow”, right?
Also, have you looked at what the All-In-One-SEO plugin for WordPress does? Do they do it right? Do you use such a tool?
28th February, 2008 at 1:01 am
@ Nick - Yep, I just started testing things and playing around with it a few months ago on the advice of Andy Beard. It was definitely a smart move.
@ Mark - Thanks, appreciate it. No, that would be a bad idea.
You want to nofollow links that go to useless pages like contact pages, about pages, duplicate pages, but you definitely don’t want to nofollow good, money pages.
I’ve heard a lot about the all-in-one seo plugin, but to be honest with you I forget all of its features. I’ll have to look into that one and get back to you. Just open a ticket at askjoshspaulding.com and I’ll reply with info on that.
28th February, 2008 at 11:29 am
Great article. There are a couple of things that I need to have no follow on.
Here is another question that I have that is slightly off topic but still dealing with meta tags.
Are meta keywords any good anymore? I have done an analysis on my new site and find that the keyword phrases used by visitors have nothing to do with what I have put in my keyword metatag. I have 8 phrases ranking in Google in positions 4 or above out of 17 phrases. All of these keyword phrases only contain one word in my metatag.
Should one change their metatag to reflect terms that the public are using?
You do a great job in explaining things that are very complicated for dummies like me.
28th February, 2008 at 1:48 pm
Thanks Josh,
Internal linking now makes sense to me….
Just to check I am on the right track… it can also mean that if I am doing a content-based site with a lot of pages (money-generating pages), it could be a liability in terms of link juice getting distributed across all those pages, thus mandating the accrual of more outbound links from authority sites to keep my PR high?
28th February, 2008 at 4:26 pm
@ Neil - The META keyword tag is 100% useless with Google and most of the other top SE’s, as they don’t even look at it and 95% useless with the SE’s who do look at it. You could remove your META keyword tag and I’d imagine you wouldn’t lose any traffic or exposure at all.
The META description on the other hand is pretty important. Be sure to make that enticing to the visitor and include your main keyword in there.
Thanks for the compliment
@ Ajith - Great, I’m glad!
Yep, you need to distribute your link juice through your site with deep linking, and keep building links.
28th February, 2008 at 5:34 pm
Thank you for that. Do you think that tag:
has any relevance?
28th February, 2008 at 6:23 pm
Hi Ksenija - WordPress converts tags into html inside comments, so you’ll have to remove the < and > from the tag for it to show up.
28th February, 2008 at 9:04 pm
Sorry, I was talking about…
meta name=”revisit-after” content=”7 days”
29th February, 2008 at 4:53 am
Hi Ksenija,
No problem. That META is worthless. The SE’s will revisit when they choose.
btw, to all others, if your comments are missing I apologize. I just switched servers and unfortunately lost about 5 hours of data.
1st March, 2008 at 10:45 am
Josh,
That was an eye opener! I have to tinker with my site more often to get the above in order!
Giving rel no follow to about me etc makes sense!
You mean to say that my home page PR will increase if I were to use the rel no follow attribute selectively?
Thanks!
Abhishek.
2nd March, 2008 at 3:58 pm
@ Abhishek - Glad you got something from it!
It can increase the authority going to your index and other pages, as you’ll be retaining “juice” that would have otherwise been wasted and you’ll be putting it back into your own pages, helping them rank better.
4th March, 2008 at 9:06 pm
Thanks Josh.
5th March, 2008 at 3:29 pm
Very Informative. Bookmarked!
nhick
10th March, 2008 at 3:53 pm
Josh, I know this is probably more code related, but do you know how to get the META tags inserted into a blog’s theme? I’m using free themes right now, and I don’t see anywhere where I can define the META tags. If I’m on the front page and I view the page source, I see only one META tag and it’s not very useful. So any pointers on this would be greatly appreciated.
Thanks
Charlie.
10th March, 2008 at 3:59 pm
Hi Charlie,
Sure, there are several plugins out there that do this. The dd-Meta-Tags and the All-in-one SEO Pack come to mind right away.
6th April, 2008 at 5:37 pm
Hi Josh,
Long time no see. Great post. It’s made me realise I’m doing things wrong. I’ve got the noindex,follow meta tag on my category, tag and date archive pages (courtesy of the All In One SEO plugin), but I’m also blocking them in robots.txt.
I guess that means I’m missing out on the PR that could be passed by the category page. I’m going to remove the blocks in robots.txt and just go with the meta tags.
Out of interest, why do you no follow the link on your own name in the comments? Is it because you want Google to associate the term “Internet Marketing Blog” with the site and don’t want the term “Josh Spaulding” to dilute that?
6th April, 2008 at 6:05 pm
Umm, actually, I jumped the gun. When I went to edit my robots.txt, I found that I’m only disallowing the following:
Disallow: */trackback*
Disallow: /wp-*
Disallow: */feed/*
I guess that’s fine. I’m a bit surprised though, I thought I was blocking categories too.
8th April, 2008 at 3:02 pm
Hey Stephen, nice to see you around
Just be careful and don’t block things that shouldn’t be blocked, so you still have a good internal linking structure.
I nofollow my name because I normally leave several comments on every single post on this blog, so it’s just kind of pointless to redirect link juice back to the main directory through my name as the anchor text. I’d rather that juice flow into my internal pages etc.