Last updated by Jessica Barnaby
A word-for-word template to regain your privacy from the Wayback Machine
The Internet Archive Wayback Machine has been trawling the internet since 1996, caching snapshots of webpages, even entire websites, and holding them in a virtual museum at https://web.archive.org.
Here’s a snapshot it took of the Twitter homepage on 30th June 2007. The black bar along the top of the image shows the number of snapshots it’s taken of that page over the years.
It’s a wonderful piece of history and the Internet Archive is free. You can go back and see how websites have changed over time.
The Wayback Machine snapshots individual pages too. Here’s Ashton Kutcher’s Twitter page as at 3rd March 2009.
This is the part I don’t like much. Say you had a Twitter account and then, a few years down the line, decided to make it private. Random people can no longer see your old posts on Twitter, which is what you want.
But the Wayback Machine has saved your posts from years gone by and anyone can find them. The Internet Archive only trawls public pages, and it took the snapshot of your Twitter page before you made it private. So it hasn’t done anything wrong.
The Internet Archive doesn’t need your permission to take a snapshot of a public page, and the Twitter terms and conditions that you agreed to undoubtedly have this covered.
Basically, you use social media knowing your posts might live on forever somewhere: as a quote in a newspaper article, on a stalker’s hard drive or in a publicly accessible internet library. It’s a risk we accept when we use social media.
A business site I created in 1998 lives in the Internet Archive Wayback museum. It was nostalgic but also spooky to see it sitting there fully functional because it’s been two decades since I killed it. But every now and again, I’ll look it up and remember that special era when the internet was brand new and I’ll think about all the clients I made logos and websites for.
A blog I used to write when I was an expat in the Middle East lives in the Wayback museum too. I killed it when I returned to the UK and sometimes wish I’d kept it live even though I don’t want anyone to read it anymore. So it soothes my soul that those expat memories are archived in a museum at an address only I know.
But there are many reasons you might not want the history of your website to live on forever in the digital library of the Internet Archive:
I decided I didn’t want this website, Wednesday Genius, in the Internet Archive. If you search for it, you’ll see this:
Before I tell you what I did to have my website removed, I want to point out there are advantages in keeping your site in the Wayback Machine:
But if you’re sure you don’t want all previous versions of your site archived by the Wayback Machine, read on.
Removing your website from the Internet Archive and keeping it out is a two-step process. Here’s what you do:
Find your website on https://web.archive.org. You’ll be customising the following email with your details.
Once you’ve customised the email, send it to info@archive.org with the subject “DMCA Take Down Notice”.
Email template: Customise the bold text with your own details
Hello
I am the owner of the domain name and website “yourwebsiteaddress.com”.
I request that you remove the following links from your website:
http://web.archive.org/web/2018*/yourwebsiteaddress.com
http://web.archive.org/web/2019*/yourwebsiteaddress.com
1) Note what I’m doing with the year. To make it easy for them, I’m listing out the URL for each year for which they are holding data.
2) You can see the years easily by looking at the black bar on top of the page. Every year that has black columns in it is a year for which they have taken a snapshot.
My Address:
Add your address here as it appears on your domain records
Phone No:
Add your phone number as it appears on your domain records
Email Address:
Add your email address. Make it easy for them by using an email associated with your domain, e.g. name@yourwebsiteaddress.com
I have a good-faith belief that the disputed use is not authorized by the copyright owner, its agent, or the law.
The above information in this notice is accurate, and under penalty of perjury, I am the owner of the copyright interest involved.
Kind regards,
Your Name
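If your site has snapshots in a lot of years, typing out each link by hand gets tedious. Here is a minimal Python sketch that builds the per-year links in the same format as the two example lines in the template. The function name wayback_year_urls is hypothetical, and the domain and years are placeholders: swap in your own domain and the years that show black columns for your site.

```python
def wayback_year_urls(domain, years):
    """Return one Wayback Machine wildcard link per year, oldest first.

    `domain` and `years` are placeholders supplied by you; the link
    pattern simply mirrors the example lines in the email template.
    """
    # Drop duplicate years and sort so the list reads chronologically.
    unique_years = sorted(set(years))
    return [
        f"http://web.archive.org/web/{year}*/{domain}"
        for year in unique_years
    ]


# Example: the two years used in the template above.
for link in wayback_year_urls("yourwebsiteaddress.com", [2018, 2019]):
    print(link)
```

Paste the printed lines into the email in place of the two example links.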
Now you need to tell the Wayback Machine robot that you don’t want it looking at your site in future.
The robots file is a .txt document that you add to the root of your domain.
Open notepad or any text editor and write this:
User-agent: ia_archiver
Disallow: /
Save the document as robots.txt and upload this file to the root directory of your site. For example, if your domain is www.mywebsite.com, you will place the file at www.mywebsite.com/robots.txt.
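Before uploading, you may want to check that the two lines say what you think they say. The sketch below uses Python's standard-library urllib.robotparser to read the rules and ask whether a given robot is allowed to fetch a page. The helper name is_blocked and the www.mywebsite.com address are illustrative assumptions. This only confirms how the rules are interpreted by Python's parser; it can't tell you whether any particular crawler chooses to obey them.

```python
from urllib.robotparser import RobotFileParser

# The same two lines you are about to save as robots.txt.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt disallow user_agent from url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# The archive robot named in the file should be blocked from every page.
print(is_blocked(ROBOTS_TXT, "ia_archiver", "https://www.mywebsite.com/"))  # True
# A robot the file does not mention is left alone.
print(is_blocked(ROBOTS_TXT, "Googlebot", "https://www.mywebsite.com/"))  # False
```

Because the file names only ia_archiver, other robots such as search engines are not affected by these two lines.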
It took them a week to confirm my website was being submitted for exclusion and when I checked two days later, it was gone.
The Internet Archive is a useful resource and I enjoy using it. As with all things, moderation is key. Not everything you do has to exist for public consumption at all times.