A word-for-word template to regain your privacy

The Internet Archive Wayback Machine has been trawling the internet since 1996, caching snapshots of webpages, even entire websites, and holding them in a virtual museum at https://web.archive.org.
Here’s a snapshot it took of the Twitter homepage on 30th June 2007. The black bar along the top of the image shows the number of snapshots it’s taken of that page over the years.

It’s a wonderful piece of history and it’s all free. You can go back and see how websites have changed over time.
The Wayback Machine snapshots individual pages too. Here’s Ashton Kutcher’s Twitter page as at 3rd March 2009.

And here’s the problem
Most places allow you to delete what you’ve written. But most people don’t even know that the Wayback Machine crawls social spaces and adds what it finds to the Internet Archive, and you don’t have any control over that.
If you delete a Twitter post or mark your account as Private, people can still find your old material if the Wayback Machine crawled it while the post was still live and public.
The Internet Archive doesn’t need your permission to take a snapshot of a public page, and the Twitter terms and conditions that you agreed to undoubtedly have this covered.
Basically, you use social media knowing your posts might live on forever somewhere — as a quote in a newspaper article, a screenshot… or in a publicly accessible internet archive. It’s a risk we accept when we use social media.
But what about normal web pages? Personal blogs and business sites? Isn’t it a problem that so many users don’t know the Internet Archive could be archiving their material forever?
Good things happen in the Internet Archive
The people at the Internet Archive believe they’re performing a great public service. To an extent, they’re not wrong.
- The Internet Archive is a piece of your history and it’s so much fun to go through old versions of your site after several years. Or to look at other websites and see how they changed over the years.
- You can use the Internet Archive to restore information you delete by mistake (though taking your own backups is better, because there’s no guarantee the Wayback Machine will have made it to your site).
- You can reference items in the Internet Archive as part of your portfolio once you’ve moved on to a different phase in your life, especially useful if the business isn’t online anymore.
Capture your business history
A business site I created in 1998 remains fully functional in the Wayback Internet Archive even though it’s been dead in reality for over two decades. Every now and again, I’ll pay a visit to that special era when the internet was brand new and I was telling disbelieving clients that every business would have a web address on their vans within five years.
Recover accidental deletions
I had a popular blog when I was an expat in the Middle East. It’s one of my biggest regrets that I deleted it when I returned to the UK because even though I don’t want anyone to read it anymore, I miss the blog and the memories. So it soothes my soul that those expat adventures are archived in the museum of the Internet Archive at an address only I know.
Prove ownership when someone steals your website
I discovered someone had stolen an old employer’s website. Because there was a history of my employer’s website on the Internet Archive, I was able to get the thief’s website shut down within about three hours of discovering the theft.
But sometimes you want to regain control
There are many reasons you might not want the history of your website to live on forever in the digital library of the Internet Archive:
- There’s private information you no longer want in the public domain
- You’re selling the website and don’t want to be associated with the new owner
- You purchased a domain and don’t want to be associated with whatever business owned it before you
- You purchased a domain and you don’t want people to know how much you bought it for (the Wayback Machine captures redirected Sedo splash pages)
- You don’t want people to see how your website changes over time
I’m good with taking regular backups and decided I didn’t want my website, Wednesday Genius, in the Internet Archive. If you search for it, you’ll see this:

The domain is special to me and it’s been on an evolving journey that I don’t feel the need to share on a public archive.
I know many people feel the same because this article was originally published on Medium in 2020. It’s had over 36,000 readers and a ton of people have emailed me for help with their own specific situation and also to offer a tip by way of thanks. Thank you, dear Readers!
If you’re sure you don’t want all previous versions of your site archived by the Wayback Machine, read on.
The Template to Remove Your Website From the Internet Archive
Removing your website from the Internet Archive and keeping it out is a two-step process. Here’s what you do:
Step One: Send This Email
Send this email to info@archive.org with the subject “DMCA Take Down Notice”
The sections in BOLD are the bits you need to customise with your own details.
Hello
I am the owner of domain name and website “yourwebsiteaddress.com”
I request you to remove the following links from your website
http://web.archive.org/web/2018*/yourwebsiteaddress.com
http://web.archive.org/web/2019*/yourwebsiteaddress.com
- NOTE what I’m doing with the year. To make it easy for them, I’m listing out the URL for each year for which they are holding data.
- You can see the years easily by looking at the black bar on top of the page. Every year that has black columns in it is a year for which they have taken a snapshot.
My Address:
Add your address here as it appears on your domain records
Phone No:
Add your phone number as it appears on your domain records
Email Address :
Add your email address.
1) Make it easy for them by making it an email associated with your domain eg name@yourwebsiteaddress.com
2) I’ve never tried it with a gmail address and expect you’d have to prove ownership another way
I have a good-faith belief that the disputed use is not authorized by the copyright owner, its agent, or the law.
The above information in this notice is accurate, and under penalty of perjury, I am the owner of the copyright interest involved.
Kind regards,
Your Name
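If the Wayback Machine holds snapshots for many years, typing one link per year gets tedious. As a minimal sketch, the Python snippet below builds the list of links in the same format as the email above. The function name wayback_year_urls and the example years are illustrative assumptions; you still need to read the real years off the black bar yourself.

```python
def wayback_year_urls(domain, years):
    """Build one Wayback Machine wildcard link per year, oldest first.

    `domain` and `years` are placeholders: use your own domain and the
    years that show black columns on the Wayback Machine page.
    """
    return [
        f"http://web.archive.org/web/{year}*/{domain}"
        for year in sorted(set(years))
    ]


# Example values copied from the email template above.
links = wayback_year_urls("yourwebsiteaddress.com", [2018, 2019])
print("\n".join(links))
# http://web.archive.org/web/2018*/yourwebsiteaddress.com
# http://web.archive.org/web/2019*/yourwebsiteaddress.com
```

Paste the printed lines into the email, one link per line.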
Step Two: Amend Your Robots.txt file
Now you need to tell the Wayback Machine robot that you don’t want it looking at your site in future.
Robots files are generally respected by crawlers, so add this to your robots file:
User-agent: ia_archiver
Disallow: /
The robots file is a .txt document that you add to the root of your domain.
- Open Notepad and write the above two lines
- Save the document as robots.txt
- Upload this robots.txt file to the root directory of your site. For example, if your domain is www.mywebsite.com, you will place the file at www.mywebsite.com/robots.txt.
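Before uploading, you can sanity-check the two lines on your own computer. This is a minimal sketch using Python’s built-in urllib.robotparser module. It only shows how a standards-following parser reads the rules, not how the Internet Archive’s own crawler behaves, and the helper name can_crawl, the example URL and the Googlebot agent name are placeholders I’ve assumed for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same two lines you are about to save as robots.txt.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def can_crawl(robots_txt, user_agent, url):
    """Return True if these robots.txt rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The archive robot named in the rules is blocked from the whole site...
print(can_crawl(ROBOTS_TXT, "ia_archiver", "http://www.mywebsite.com/"))  # False
# ...while a crawler the file does not mention is still allowed.
print(can_crawl(ROBOTS_TXT, "Googlebot", "http://www.mywebsite.com/"))  # True
```

If the first line printed is not False, check the file for typos before you upload it.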
It took them a week to confirm my website was being submitted for exclusion and when I checked two days later, it was gone.
The Internet Archive is a useful resource and I enjoy using it. As with all things, moderation is key. Not everything you do has to exist for public consumption at all times.