Your understanding is mostly correct—a 410 status tells Google the pages are permanently gone, and over time they should drop the URLs from the index. However, to ensure the process works smoothly, consider these points:
• Ensure All URLs Return 410: Make sure every URL in the /archive/ folder consistently returns a 410 status code (a per-URL example follows this list).
• Avoid Blocking via Robots.txt: Don’t disallow the folder in robots.txt; if Google can’t crawl the pages, it won’t see the 410 status and may keep the URLs in its index.
• Use Google Search Console: If you need faster removal, use the URL removal tool in Google Search Console to expedite the de-indexing process.
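For reference, if only a handful of pages were involved, a per-URL 410 via mod_alias in the root .htaccess would look something like this (the filename here is just a made-up placeholder):
# Hypothetical example: return 410 Gone for one specific retired page
Redirect 410 /archive/old-page.html
Since 410 isn’t a redirect status, mod_alias takes no destination URL; the status code itself is the signal.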
Point 3 is a good one. There is a temporary removal tool, but from what I understand it’s only good for about six months. Still, adding the URLs to that tool on top of the rewrite should get a faster result. So great call, thanks.
Point 1 is a tough one. The /archive/ folder contains hundreds of markdown files, each full of links to a file system that no longer exists. Setting up a unique instruction for each URL isn’t feasible.
Is there a way to set up a “catch-all” instruction in the .htaccess?
Yes, you can use a catch-all rule in your .htaccess file. For example, using mod_alias you can add this line to return a 410 for any URL under the /archive/ folder:
RedirectMatch 410 ^/archive/.*$
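One small optional tweak: if the bare /archive URL (with no trailing slash) was ever indexed too, a slightly broader pattern should catch it as well:
# Also matches /archive and /archive/ themselves, not just pages beneath them
RedirectMatch 410 ^/archive(/.*)?$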
Alternatively, if you’re using mod_rewrite, you can add:
RewriteEngine On
RewriteRule ^archive/.*$ - [G,L]
Remember: make sure your robots.txt isn’t blocking these URLs, so search engines can actually see the 410 responses.
Since you want the 410 to apply only to the /repo/archive/ folder and its subdirectories, and you’re placing your .htaccess file at the document root, you can use a single rewrite rule that targets that specific path. For example:
RewriteEngine On
RewriteRule ^repo/archive/.*$ - [G,L]
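If you’d rather stick with the mod_alias approach from earlier, the equivalent single line in the same root .htaccess should be:
RedirectMatch 410 ^/repo/archive/.*$
Either way the response is the same 410; use whichever module your host has enabled.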
• HTTP Response:
The [G] flag tells Apache to return a 410 Gone status. This effectively informs browsers and search engines that the content is permanently unavailable, rather than simply preventing crawling (a complete sketch, including an optional custom 410 page, follows this list).
• .htaccess Location:
Since the .htaccess file is at the root, the rule must include repo/ in the pattern to correctly reference the full URL path.
• Regarding Robots.txt:
It’s important not to rely on a robots.txt file for this purpose. While robots.txt can prevent search engines from crawling certain directories, it doesn’t stop them from indexing the URLs if they’re linked elsewhere. With the .htaccess rule returning a 410 status, search engines will recognize that the content is permanently removed and are more likely to drop those URLs from their index.
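Putting it together, a minimal sketch of the root .htaccess (assuming mod_rewrite is available; the custom error text is optional and purely an example):
# Return a 410 Gone for everything under /repo/archive/
RewriteEngine On
RewriteRule ^repo/archive/.*$ - [G,L]
# Optional: serve a short custom message instead of Apache's default 410 page
ErrorDocument 410 "This archived content has been permanently removed."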
This should do the trick; give it a couple of weeks and let me know if you’re still seeing issues. BTW, Google just began a new core update yesterday, so it may be a few days before they get things sorted (ha; not that I’m counting on them ever getting search sorted again).