Someone stole my client’s old website and I shut them down
When you put something online, there’s always a risk it will be stolen by some lazy deceitful chancer looking to profit from your work.
The injustice of having your work stolen makes you feel violated and helpless, but you don’t need to let these feelings roost. Many web hosts and social media companies have established procedures to deal with website scraping and copyright infringement happening on their servers.
It recently happened to me, and I got the thief’s website shut down within around three hours of discovering the theft.
The most important thing you need is proof you own the copyright. With online content theft, proof can be the URL of the original article, preferably date-stamped.
Some sites disable date-stamps for various reasons. Sometimes it’s to hide that a blog isn’t regularly updated. Other times it’s because articles are evergreen and don’t need a date that ages them. If this is you, you can often find your original publication date in the source code, sitemap, or even the Google search results.
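If you would rather not read through the page source by eye, a short script can pull out the usual date hints. This is a minimal sketch, not a definitive tool: it assumes the date sits in a meta tag such as article:published_time or in a time element, which are common conventions but not universal. The class name, function name, list of keys and sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser


class PublishedDateFinder(HTMLParser):
    """Collects date hints that pages commonly leave in their markup."""

    # Meta names/properties that often carry a publication date.
    # This list is an assumption and is not exhaustive.
    DATE_KEYS = {"article:published_time", "date", "dc.date", "pubdate"}

    def __init__(self):
        super().__init__()
        self.dates = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            # The key may live in either "property" or "name".
            key = (attrs.get("property") or attrs.get("name") or "").lower()
            if key in self.DATE_KEYS and attrs.get("content"):
                self.dates.append((key, attrs["content"]))
        elif tag == "time" and attrs.get("datetime"):
            self.dates.append(("time", attrs["datetime"]))


def find_published_dates(html):
    """Return (source, value) pairs in the order they appear in the page."""
    finder = PublishedDateFinder()
    finder.feed(html)
    return finder.dates


# Made-up page source, purely for illustration.
sample_html = """
<html><head>
<meta property="article:published_time" content="2015-03-02T09:00:00+00:00">
</head><body><time datetime="2015-03-02">2 March 2015</time></body></html>
"""
print(find_published_dates(sample_html))
```

Save the source of your own page to a file and pass its contents to find_published_dates. If nothing comes back, the date may still be in your sitemap or visible in the search results.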
But what do you do if your original work is no longer online? How can you prove there’s been a case of copyright infringement?
Earlier this month, I discovered a long-standing client had had his entire website scraped by one of his competitors. Let’s call my client Henry (not his real name) and the competitor Dastardly (which may well be his real name).
Dastardly stole Henry’s entire website, changed the logo and contact details, and passed the site off as his own. I came across it by chance and recognised it instantly because the website had been completely designed, written, coded and illustrated by me. All the images still had Henry’s logo watermarked on them.
Now here’s the problem. Henry’s site with all my content is no longer online.
Henry started to wind down his business in 2017 and closed the doors in 2018 to take retirement. He left his website running for nostalgic reasons until it went offline automatically when his hosting and domain expired later that year.
Dastardly scraped Henry’s website in 2017. Henry didn’t notice. He’s in retirement and no longer in contact with his peers. He’s got nobody monitoring Google for anything to do with the business because nobody monitors for ghosts.
Google stopped indexing Henry’s website when it went offline in 2018. Its web crawlers couldn’t find any of the pages, so over the next few months Google removed them from its search results, until all that was left of Henry’s domain were dead links from online directories that never update their content.
The thing is, when Dastardly scraped Henry’s website, he didn’t just steal a bunch of random words on a page, he stole instant credibility and positioning.
Henry had paid me handsomely to design a highly-optimised, highly-targeted website that would “blow his competitors out of Google.” He wanted the site to be rich in his values and shaped to inspire trust. He gave me free rein and I gave him a site that was constantly in the top three results for all of his industry keywords. It appeared on the results page with sitelinks and came top in Google Maps for local searches.
And guess who was benefitting from all this optimisation now?
Yes, by stealing the entire website, Dastardly now had all these highly-optimised pages sitting on his domain, ready to reap glorious rewards. Henry’s site is gone, no longer indexed, so Google uses all my optimisation to take Dastardly to the top of the results page, which is where I find him sitting pretty on January 6th, 2021.
I speak to Henry, just in case Henry sold the rights and I’m wrong about the copyright infringement. But Henry’s disgusted too. He says he would have sold the site to Dastardly if approached, but of course, if someone’s going to steal, they’re not going to buy.
We go through the available options. He’s not looking for compensation in the first instance and asks me to handle the matter. He says he’ll bring in a lawyer if things don’t work out.
There are two things I want to achieve:
- Report the copyright infringement to Google and have them remove Dastardly’s links from Google search results
- Have Dastardly remove Henry’s content from his site
Okay. But what about the problem of Henry’s website no longer being online? If it’s not indexed anymore, how can I prove there’s been a copyright infringement?
The Internet Archive Wayback Machine is the first port of call. They send robots out every day to capture websites and store them as part of internet history.
I found they had 163 captures of Henry’s website over a period of seven years, so all I had to do was decide which of these captures to use. I chose one from 2015 because it easily proved the copy existed long before Dastardly used it.
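I picked my capture by browsing the Wayback Machine in the browser, but the Internet Archive also offers a small availability lookup that returns the capture closest to a date you choose. The sketch below is one way to use it, assuming the endpoint at archive.org/wayback/available and its JSON reply with an archived_snapshots entry. The function names are mine, and the sample reply is invented to show the shape of the data; it is not a real capture.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_lookup_url(page_url, timestamp):
    """Build the lookup URL. timestamp is YYYYMMDD (or longer, up to seconds)."""
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return f"{AVAILABILITY_ENDPOINT}?{query}"


def closest_capture(response):
    """Return (timestamp, archive_url) for the closest capture, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def lookup(page_url, timestamp):
    """Ask the Internet Archive for the capture nearest to timestamp."""
    with urlopen(build_lookup_url(page_url, timestamp), timeout=30) as reply:
        return closest_capture(json.load(reply))


# Invented reply for a placeholder domain, showing the expected structure.
sample_response = {
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20150314000000/http://example.com/",
            "timestamp": "20150314000000",
            "status": "200",
        }
    }
}
print(build_lookup_url("example.com", "20150101"))
print(closest_capture(sample_response))
```

Calling lookup("example.com", "20150101") with your own domain in place of the placeholder gives you an archive URL you can paste into a report. A result of None means no capture was found, and you would need other proof.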
If you’ve removed your site from the Internet Archive, or if your site hasn’t been captured, what you do next depends on your situation. Exploring those options is beyond the scope of this article.
Report a Copyright Infringement to Google
Log into your Google account and visit their reporting page. My complaint relates to their search engine, so that’s what I picked.
Then you have three boxes to complete:
Box 1: Write your complaint here. Keep your report short and sweet because they only give you 500 characters to explain what the problem is.
Box 2: Provide URLs to the legitimate material on your own website or on the Internet Archive.
Box 3: Provide URLs to the stolen content on the thief’s website.
Then sit back and wait for Google to investigate.
Have Your Content Removed From the Thief’s Website
Since I didn’t want to contact Dastardly directly, I emailed his web host instead. My email was a simple couple of paragraphs explaining the copyright infringement and providing links to Henry’s website on the Internet Archive to prove ownership.
Just forty minutes later, the web host emailed to say they’d suspended Dastardly’s account and given him a warning.
I checked and his site was dead. Job done.
The next day, the host emailed me to confirm Dastardly had reset his account and wiped all the data. I’m assuming this means he’s proceeding with a new installation of WordPress.
Google took three days to investigate my complaint but found there was nothing left for them to do: Dastardly’s web host had already taken the site down and all the pages had been deleted.
I must give full kudos to Dastardly’s web host. I had not expected them to take action so quickly.
Eight of my images are still appearing in a web search for Dastardly and it’ll probably take a few weeks for them to roll out of the Google cache. If they haven’t gone by March, I’ll submit another request to Google.
From Shock to Solved in 3 Hours
The shock of finding someone has stolen your entire website is harsh. But I hope I’ve shown you that there are established procedures in place to address copyright infringement.
I discovered the stolen content around 2pm. Then it was shock, chat with Henry, list the URLs of the stolen content on Dastardly’s site, find the URLs of the original work on the Internet Archive, complete the Google form, and email Dastardly’s web host.
By 5:30pm, Dastardly’s website was suspended.
As you can see, the hardest part is actually discovering your content has been stolen. If your business is live, set up Google Alerts for your business name and unique sentences so you’re notified when they appear elsewhere online. If your site isn’t live anymore, then it’s a judgment call as to whether or not to keep monitoring it. Alerts don’t cost anything, so perhaps keep them active for a year or two.
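Once an alert flags a suspicious page, it helps to know exactly which of your sentences appear on it. The sketch below is a rough check under simple assumptions: you have a list of your own distinctive sentences and you have copied the text of the suspect page. It is not part of Google Alerts, and the function names and sample text are hypothetical.

```python
import re


def normalise(text):
    """Lower-case, straighten curly quotes and collapse whitespace."""
    text = text.replace("\u2019", "'").replace("\u2018", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()


def matching_sentences(your_sentences, suspect_text):
    """Return the sentences of yours that appear word for word in suspect_text."""
    haystack = normalise(suspect_text)
    return [s for s in your_sentences if normalise(s) in haystack]


# Made-up sentences and page text, purely for illustration.
my_sentences = [
    "We hand-finish every widget in our own workshop.",
    "Nobody monitors for ghosts.",
]
suspect_page = """
Welcome to our site!   We hand-finish every widget
in our own workshop. Call us today.
"""
print(matching_sentences(my_sentences, suspect_page))
```

Exact matching is deliberately strict. A thief who rewrites a few words will slip past it, so treat an empty result as "not proven" rather than "not copied".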
Also keep your own digital records of content, purchased stock images etc., so you’re not reliant on external archives.
I’m keeping my eyes on Dastardly. His site has been down for two weeks now and he’s got a “Coming Soon” page up. I’m intrigued to know if he writes the next one himself or if he’s found another website to scrape.