How to Find and Remove Broken Links in Your Website

Mar 23 2010 by Omer Greenwald | 72 Comments


Broken links are links that lead to pages that do not exist. When you click a broken link, the server answers with a 404 status code, the standard HTTP response indicating that the requested URL was not found, and the page you land on is called a 404 error page.
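To make the 404 idea concrete, here is a minimal Python sketch (not part of the original tutorial) that asks a server for a URL's status code and labels the result. The function names `classify_status` and `fetch_status` are made up for illustration, and counting 410 (Gone) as broken alongside 404 is an assumption on my part.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Label an HTTP status code for link-checking purposes."""
    if status in (404, 410):  # assumption: treat "Gone" like "Not Found"
        return "broken"
    if 300 <= status < 400:
        return "redirect"
    if 200 <= status < 300:
        return "ok"
    return "error"


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code the server sends for `url`.

    Uses a HEAD request so the page body is not downloaded. Some servers
    reject HEAD, so a real checker may need to fall back to GET.
    urllib follows redirects on its own, so this reports the final status.
    Connection failures (urllib.error.URLError) are left to the caller.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the code is still available.
        return err.code
```

Calling `classify_status(fetch_status(url))` on one of your own links would tell you whether a visitor clicking it lands on a 404 page.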

What do you do when you happily surf the web and suddenly come across a 404 error? For most of us, the immediate response is to leave the site in favor of another one. Both people and search engines consider broken links unprofessional.

404 errors and broken links can also hurt your search engine rankings, so it pays to be proactive about avoiding them to improve exposure and increase site traffic.

Note: some terms and methods in this tutorial are aimed at WordPress bloggers, but the article is relevant to any website owner.

Bloggers update their content more often than other site owners do. Therefore, their sites are more likely to accumulate broken links.

For WordPress blogs, there are two types of plugins that can be used to deal with those links:

  1. Plugins that detect broken links in your site, like Broken Link Checker.
  2. Plugins that manage 301 redirects automatically, like Redirection.

As a plugin minimalist, I usually insist on doing things manually to avoid installing plugins. In this case, though, you can rest assured that being able to deal with these problems efficiently is worth installing another plugin. (See the short list of plugins that Six Revisions uses.)

Whether you use a plugin or not, I highly recommend checking your website occasionally for broken links and 404 errors.
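If you would rather handle the 301 redirects mentioned above by hand, one option is to keep a list of moved URLs and generate the redirect rules from it. The sketch below is only an illustration and is not from the original tutorial: it assumes an Apache server with mod_alias enabled (where `Redirect 301 old-path new-url` is the directive), the helper name `htaccess_redirects` and the example.com URLs are hypothetical, and it assumes the old paths contain no spaces.

```python
from urllib.parse import urlsplit


def htaccess_redirects(moved: dict[str, str]) -> list[str]:
    """Turn {old_url: new_url} into Apache mod_alias 'Redirect 301' lines."""
    lines = []
    for old_url, new_url in sorted(moved.items()):
        # mod_alias matches on the URL path, so drop the scheme and host.
        old_path = urlsplit(old_url).path or "/"
        lines.append(f"Redirect 301 {old_path} {new_url}")
    return lines


# Hypothetical example: a post whose permalink was changed after publishing.
moved_posts = {
    "http://www.example.com/2010/03/old-post-name/":
        "http://www.example.com/2010/03/new-post-name/",
}

for line in htaccess_redirects(moved_posts):
    print(line)
```

The printed lines could then be pasted into the site's .htaccess file, so visitors and search engines following the old URL are sent to the new one instead of a 404 page.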


Detecting and removing invalid URLs using Google Webmaster Tools

There are two reasons why pages are indexed in Google even though they don’t exist on your website:

  1. You referenced an invalid internal link by mistake because of a typo. (This is a good time to recommend a very simple yet essential plugin for post authors, the Link to Post plugin, which helps avoid such mistakes.)
  2. You published a post and later changed its permalink (the post’s URL) after Google had already indexed the original link.

The best way to detect these errors is by using Google Webmaster Tools. If you haven’t done so already, register your site there. It’s an essential tool to have for anyone running a website.

One of the most important tools provided in GWT is the Remove URL tool, which allows you to remove invalid pages from Google search results.

Let’s see how to remove those bad URLs from Google’s index.

Detecting invalid pages that are indexed by Google

Once you have signed up for Google Webmaster Tools and have set it up and verified your site (see Google’s Getting Started guide for GWT), here is the process for finding invalid pages.

1 Click on Diagnostics from the left menu and select Crawl Errors.

2 Select the Not Found category.

3 If available, click to view which page contains the broken link.

4 To make sure that the URL is indeed indexed in Google, copy and paste the URL into Google’s search and see if any result comes up.


Removing URLs from Google’s search results

1 Click on Site Configuration, and select Crawl Access.

2 Select Remove URL.

3 Click on New removal request.


4 Select the first option to completely remove a page from Google search results.

OR

Select the fourth option in case you want to retain the page in search results but would like to remove the cached version of the page. This is useful in case Google displays an outdated version of the page in the "cached" link.

5 Click on Next.


6 Type the URL of the page you would like to remove from Google.

7 Make sure the first option is selected and then click on Add.

8 The URL to be removed should now appear in the list. If you want, you can add more pages for removal.

9 Click on Submit Removal Request.


Your request is now pending—in most cases, it only takes 2 to 3 days for Google to remove the URL.

10 Click on Site Configuration and select Crawl Access.

11 Select Remove URL.

12 Make sure the URL you requested to remove does not appear in the list of pending requests.

13 Click on Removed URLs to see that the URL is now listed there.

To make sure that the URL was indeed removed from Google, copy and paste the URL into Google’s search box and see if any result comes up.


Detecting Broken Links using Xenu Link Sleuth

Another excellent tool I like using for hunting down broken links in my websites is Xenu Link Sleuth. You can download it here.

Unlike the WordPress plugins mentioned earlier in this article, Xenu is a standalone desktop application for Windows that outputs all your site links, valid and invalid alike, and groups them in a very readable fashion.


After you install Xenu, using it is really easy.

1 Click on File and select Check URL.

2 Type your website’s URL (e.g. http://www.sixrevisions.com/).

3 Wait for all links (site wide!) to be checked.

4 When Xenu asks whether you want a report, click on Yes.


You can enter your FTP server details, but I simply click on Cancel and Xenu generates an XHTML report locally (it opens a dialog window automatically).

In the generated report, click on Broken links. Sort it by link in the table of contents to see all the pages that have broken links in them (and the broken links of course).

Finally, once you’ve detected all broken links, what is left to do is to navigate to the posts and pages containing references to broken links. You should either fix or remove those links.
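For readers curious about what a site-wide checker does under the hood, here is a rough Python sketch of the core idea: collect the links on each page, look up each link's status, and group the broken ones by the page that contains them. This is not how Xenu is implemented (I have not seen its source); all names here are hypothetical, and the status lookup is passed in as a function so the example can run without touching the network.

```python
from collections import defaultdict
from html.parser import HTMLParser
from typing import Callable
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collects absolute http(s) URLs from the <a href> tags of one page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links and drop any #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, value))
                # Skip mailto:, javascript: and similar non-web links.
                if absolute.startswith(("http://", "https://")):
                    self.links.append(absolute)


def extract_links(html: str, base_url: str) -> list[str]:
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def broken_links_by_page(
    pages: dict[str, str], status_of: Callable[[str], int]
) -> dict[str, list[str]]:
    """Map each page URL to the links on it that return 404 or 410.

    `pages` maps a page URL to its HTML; `status_of` returns the HTTP
    status code for a link (for example, a function that sends a request).
    """
    report = defaultdict(list)
    for page_url, html in pages.items():
        for link in extract_links(html, page_url):
            if status_of(link) in (404, 410):
                report[page_url].append(link)
    return dict(report)
```

The resulting dictionary is the same kind of "page, then its broken links" view you navigate in the report, which makes it straightforward to go through the posts one by one and fix or remove each link.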

What are your own techniques and tools for finding broken links? How often do you search your site for broken links? How important is it to eliminate invalid links, and why?


About the Author

Omer Greenwald is a web developer, blogger and WordPress enthusiast. He specializes in HTML, CSS, JavaScript and PHP. You can check out his blog, WebTechWise, which shares tips and tutorials for bloggers and site administrators. To connect with the author, follow him on Twitter as @WebTechWise.

72 Comments

Cheryl

March 23rd, 2010

Beautiful article. You mention Xenu (which I’ve used in Windows environments). If you’re on Mac, the wonderful “Integrity” (shareware: remember to donate) is just the trick.
http://peacockmedia.co.uk/integrity/
Cheers.

krike

March 23rd, 2010

I never search for broken links… until now :D this is awesome :D exactly what I needed :) thanks

adone

March 23rd, 2010

Very good article. Thank you.

The Inside Design

March 23rd, 2010

Great article and very helpful for removing the broken links that show up on a website from time to time. There are also plugins, for WordPress for example, that make it easier to remove and/or redirect them as well.

Michael

March 23rd, 2010

good stuff, thanks for sharing, but to use the acronym GWT for Google Webmaster Tools is really misleading since GWT is widely known to refer to the Google Web Toolkit.

bbrian017

March 23rd, 2010

My blogs are fine but my social network blog engage has many broken links. I’ve been way too lazy to fix it and now it’s simply out of control.

I wish there was an easier way to accomplish this task. How can I humanly remove over 225 broken links from my website?

I guess I will have to dedicate one weekend to broken link removal! This news depresses me a little lol.

Omer Greenwald

March 23rd, 2010

@Cheryl – Thanks for sharing the alternative app for mac, I wasn’t sure which one competes with Xenu.

@krike, @adone – I’m glad you find this post useful.

@The Inside Design – You are right that there are more excellent plugins for this task besides the two I have mentioned.

@Michael – I wasn’t aware that this acronym was already taken, thanks for pointing that out.

Anders

March 23rd, 2010

I use Xenu regularly to check for broken links, so that’s not such a big problem. Broken links are quite easy to detect and fix, but how do you find links that still work but no longer point to “suitable content”, i.e. the page you linked to has been replaced with something entirely different?

Demir

March 23rd, 2010

that’s the only thing what i need for my websites. thanks!

Zach

March 23rd, 2010

My only experience has been using Google Webmaster Tools so it’s nice to see some other options. Thanks for the good read!

Nick

March 23rd, 2010

Another great tool we use for broken link reporting after launching a new site is http://www.linkpatch.com. It emails you in real-time when someone gets a 404 error and sends all the details you need to fix it.

Beth McLain

March 23rd, 2010

Really Interesting Post. It will come handy, definitely need to bookmark.

Omer Greenwald

March 23rd, 2010

@bbrian – I’m not familiar with tools that automatically deal with broken links. I guess the reason is that a human sense is involved in the process of deciding what to do with each specific link (whether to remove it completely including the linking text, remove only the hyperlink, or change the destination of the link).

@Anders – You are right that Xenu does not detect links that lead to irrelevant pages, do you know an application that does that?

@Demir, @Beth – thanks for the good words.

@Zach – Xenu is great for double checking Google Webmaster Tools and tracking additional broken links.

@Nick – it seems like a decent tool, however, you need to embed a script in your site for that, right? Anyways, it might be worth it.

JC

March 23rd, 2010

Thanks for this lovely post on Broken Links.
Very helpful. Especially the clarity of using Google Webmaster Tools & the suggestion of “standalone apps” for checking broken links.

I would request sixrevisions to post something more on SEO related topics relevant to server side technology point of view.

Andy Forsberg

March 23rd, 2010

If you don’t want to install anything you can just run SpyderMate on your site for free (http://spydermate.com).

Melody

March 23rd, 2010

I had previously seen a few broken links but never knew how to correct it..Thanks, I will surely handle them now!

Duane Kinsey

March 23rd, 2010

Terrific article Omer. I have just checked webmaster tools for my site and found some broken links. Removed immediately.
Thanks for such an informative post.

MAK

March 24th, 2010

nice one omer thanks for the nice stuff

Anna

March 24th, 2010

Your information help me a lot. Good job.keep it up. Thanks!

Jacob Gube

March 24th, 2010

Thanks for everyone’s comments! Just a reminder: If you know of any good links and resources on this topic, please do share it here in the comments!

BeBeN

March 25th, 2010

like usually…usefully article
thanks ^_^

iamronel

March 27th, 2010

so awesome..thanks sir ill be looking for this post a long time..:)

Harsh Agrawal

March 28th, 2010

Very nice info…on WordPress I also prefer using broken link checker plugin

yoash

March 29th, 2010

ahh great! I just removed two 404′s via the method mentioned above. thank you!

Lana

March 30th, 2010

You may check for broken links here, too:
http://validator.w3.org/checklink

Omer Greenwald

March 31st, 2010

@Andy – That’s a nice tool. thanks for sharing
@Lana – W3 validator seems to check only a single URL/page instead of finding broken links site wide. thanks

NavaPavan

April 6th, 2010

Wow that’s amazing. Thanks for sharing.

rita morgan

April 10th, 2010

I’ve been looking at 15 Not Found link errors on my site for some time now and I finally know how to get rid of them. I really appreciate your help :-D

Jonathan Matthews

April 11th, 2010

Thanks for the interesting article. My company has a product (DeepTrawl) which makes link checking & fixing site-wide really easy. If you’d like a license to check it out properly please let me know.

abbas

May 1st, 2010

hi
thanks for this article.
Is there any software for checking internal links?

Rieke

May 30th, 2010

Thanks for this very useful and nice tip!!

Marigold

May 31st, 2010

ladies and gentlemen, we have a winner here.

Michele Valongo

June 1st, 2010

Very useful articles! I found some broken links on my website! hehehe…

Mark

June 21st, 2010

Fantastic article,

Your instructions are so much easier to understand than GWT.

I’ll be following your 13 step process to the “T” – to fix several of my 404 error pages

Thanks again Omer…

I’m going to follow you on Twitter

Omer Greenwald

June 21st, 2010

Thanks for the compliments, Mark.

andrew

July 20th, 2010

what if the link isn’t broken? i mean the post does come out on the home page.. how is this possible?

Belote

July 21st, 2010

@Andrew I don’t really understand your problem ? What do you mean exactly ? Is this a problem ? Or is it just normal ???

Richard Chidike

August 15th, 2010

Thank God i am able to find this post. I have been searching for months on how to fix not found errors on my blog and this post has come to my aid.

Andres Lobo-Guerrero

September 25th, 2010

Thank you about the tip on Xenu. It was just the resource I was looking for. I’m looking forward to reading you post on 6 Critical WordPress plugins.

Doug S

October 25th, 2010

I normally don’t have time for this kind of thing. Thank you very much. The google method worked fine. You saved me time, cost and frustration with your information, Omar. You rock.

dselva

December 16th, 2010

Thanks for the post to remove the broken links.

Paul

January 3rd, 2011

I used to use Xenu but now use the Screaming Frog SEO spider http://www.screamingfrog.co.uk/seo-spider/ to detect broken links, redirects and all manner of SEO issues.

It’s similar to Xenu, but finds more information about SEO and the website.

Lalit

January 20th, 2011

Excellent post.. Loved it.. Thanks!

amith

February 13th, 2011

Last few days I have noticed that Google spider is crawling my site but not indexing any pages. So I checked using URLs from my site and found out the specific date from which Googlebot stopped indexing pages from my site. I checked all the posts I have published on that date and found out that on one post a given link returns a 404 error. I corrected that link in the post and noticed that my web pages are now appearing on the search result. I visited Webmaster tool and found some more old broken links on my site but that not affected Googlebot indexing pages.

Anyway I want to remove those errors and a thorough search on the web reached me here. Earlier I thought of writing an article regarding this but I dropped the idea when I saw your detailed article here. Regarding Xenu I have also tried it earlier but ended up in disappointment as it produce lot of time out messages.

Thanks.

eTipsLibrary

March 5th, 2011

I used Xenu. It works like a charm. Thanks for sharing.

Sandeep Singh

March 8th, 2011

Very informative post mate ! Broken Links can always be a big head ache , it is better to use plugins and remove these errors !

yash

March 11th, 2011

I have a classifieds site. So there are content/post that might get drafted after some time. Is there something that can be done so that the site is not penalized by google or other site engines. Also can we inform bots in advanced that remove certain post after specific days time.

gegejosper ceniza

April 24th, 2011

Thanks for the post… I am looking for this to solve my broken links problem in my site.

Praveen-rose

May 3rd, 2011

Hi,

I have a website that contain more then 20000 pages. So how to check the broken link of these pages. Is there any free tools to check entire websites broken link of my website. Please suggests.

Thanks

PR

Asigurari RCA

June 22nd, 2011

Excellent article, I’ve checked my site after I read this article and I found over 100 broken links. Thx again

Osho Garg

July 13th, 2011

Many Thanks For This Article :)

Fotoviaje

August 14th, 2011

Thanks for your post. Your article is very interesting. I didn’t know how to find broken links, although I have my 404 error page set in my .htaccess file.

shan ali

August 20th, 2011

i had some broken links but did not know what to do with them…thanx for the help….gr8 article

Naeem

August 27th, 2011

thanks for post i am really looking for it

rolanstein

September 6th, 2011

A couple of months ago I changed my permalinks from the default WordPress structure to a customised structure (/%post_id%/%postname%/), and since then my traffic has dropped off dramatically. I thought this was a temporary thing, and that I would regain my usual hit numbers as time went on and Google re-indexed my site pages. However, quite a few weeks after changing the permalinks structure, this hasn’t happened.

In trying to work out an explanation for my severe decline in traffic, I’ve recently been trying to analyse my blog performance by delving into Google Webmaster Tools, but cannot really understand much of the data I can see there. I wonder if anyone could help with this by responding to the following 4 questions?

1. Checking ‘Crawl Errors’, I have 206 404 errors. If I click on the first URL listed, it brings up a WordPress page stating:

“This is somewhat embarrassing, isn’t it? It seems we can’t find what you’re looking for. Perhaps searching, or one of the links below, can help.”

I have no idea what page is missing, or how to find out. Can anyone assist with an explanation, by any chance? Once I have one clear example, I should be able to apply it to other such URLs that have 404 errors (there are many like this one amongst the 206 URLs listed!).

2. The 2nd URL listed amongst the 404 error URLs is a page that was a sticky but that I have now taken off my blog, so I can see why it would bring up a 404 error. However, under the ‘Linked From Detected’ column there are 10 pages. I don’t understand why this now-absent sticky would be linked to from 10 pages (as is shown), and I don’t see why any of those specific 10 URLs listed would link to that URL. Those 10 URLs are all my blog posts, none of which have anything to do with the 404 error sticky URL I have taken down. I did not ever internally link from any of these posts to the now missing 404 sticky URL. Does anyone have a simple explanation for why these pages should be linked to the missing page, please? It just doesn’t make sense to me.

Sorry about these confused questions – there is obviously a lot I don’t understand about these crawl errors, but if, as I suspect, they are interfering with my blog posts being indexed properly, I really want to learn whatever is needed to address the issues involved and remedy them. I am terminologically and technically “challenged”, but can figure things out if I have access to clear examples. So, would greatly appreciate some help if anyone is up to that!

Cheers

rolanstein

September 6th, 2011

Whoops – I meant “following 2 questions”, not 4. 2 is more than enough for now!

Marv

October 1st, 2011

I can’t seem to remove the crawl error and it doesn’t give me the option to. Will dead links be removed from google over time?

Sriram

October 7th, 2011

very useful. i was searching for this very helpful indeed.

Lindsey

October 12th, 2011

This was very helpful! Thank you! However, is there a way to delete multiple links at once?? I have to delete over 300 and doing that one-by-one is going to take a while…

Deadwin

October 13th, 2011

This is really helpful article i was getting lots of broken link but couldn’t find any such a nice tutorial.

David

October 25th, 2011

A really good article. I have followed your advice and have removed my broken links.

Samantha

October 26th, 2011

I have 55 broken links in total. Whilst I have read up how to remove them from google via GWT’s, I am having trouble fixing them. Someone please advise me further or point me in the right direction to fix them on my website. I’ve had enough of repeatedly paying webdesigners to fix the same problem.

Well appreciated!

M Kumar

October 31st, 2011

I have around 9000 inlink from good website, I don’t know the reason that how can these links formed, but status visible in webmaster, and around 600 crawl errors, i am not knowing the exact reason of my website rank dropping, Please any one guide me, what would i do for sort out this problem??

prakash

November 17th, 2011

nice article thanks for helping me to remove broken links

Rahul

November 19th, 2011

Very useful tool…4 out of 5…
thanks for posting

nischal

June 15th, 2012

Thank god i knew good information from here

Liliana

August 29th, 2012

Hi there! I just wish to give you a huge thumbs up for your
great information you have got right here on this post.
I’ll be coming back to your site for more soon.

Iola

August 29th, 2012

Wow, awesome weblog structure! How long have you ever been
running a blog for? you make running a blog glance easy.
The full look of your site is great, let alone the content
material!

Istiak Rayhan

October 8th, 2012

Very nice article …. I am using BLC plugin in WordPress to fix broken links.

Scott Grodberg

June 5th, 2013

God, can anything beat Xenu Link Sleuth? It’s easily the fastest and most specialized link checking tool there is.
But I’m stupid because I made another one anyway! Mine is web-based so it’s a little different (read “simpler”) and it’s called HTML Validator Pro. Come and use it up!

Bon

June 14th, 2013

Google WebMaster tool is great to find all the broken links on your site. The next step to fix it is to do 301 redirects

sampath lokuge

December 22nd, 2013

Really helped me.Thanks a lot :)
