Footprints in a blog network are the obvious signals that the sites that link to you belong to the same person or persons. They could be spotted by your competitors, by Google's algorithms or by its manual reviewers.
Most importantly, they are signals that your links are not earnt but deliberately built to manipulate and improve your rankings.
Understanding what footprints matter
If we sit down to list all the things that could constitute a footprint, we’d be here all week. A scan of the blackhat forums throws up any number of crazy theories about what is a hazard for your site – some credible, some less so. But if you want to know how to build a private blog network safely, it pays to know which ones you should be thinking about.
It makes life simpler if you break down the signals into two categories. There are those that would be detected by a human, and those found by a machine. Then, you can see which would be obvious or difficult for a human to detect, and which would be too complicated or expensive for a machine to determine. This could be due to the high number of false positives or the high processing power required to calculate.
For example:
Human detection
Easy to detect:
- Same theme (visually)
- Patterns in domain names (eg topforexblog.com, greatforexblog.com, idealforexblog.com)
- Obviously commercial links (anchor text, sitewides etc) or content
Harder (or more expensive) to detect
- Whether poor quality content is due to poor language skills or machine generation
- Patterns in IP addresses, ownership, CMS footprints etc
Algorithmic detection
Easy
- Patterns in IP addresses
- Patterns in WHOIS data or registration dates
- Using the same theme or CMS (HTML analysis)
- Unnatural patterns on site (high numbers of redirects or 404s)
- Lightweight content analysis: Duplicate text etc
Harder / more expensive
- Change of site topic or concept – this is likely to be calculated periodically, especially when a domain drops or changes ownership
- Visually similar sites using screenshots and image analysis. Google has plenty of image processing power. But this is still expensive when multiplied across billions of sites with hundreds of backlinks each
- Deeper content analysis. For example, identifying repetitive themes, concepts and topics within a network vs the human generated echo chamber that exists in the blogosphere (for example ‘10 Amazing Home Decorating Tips’ vs ‘7 Awesome Tips For Home Décor’)
Many of these footprints are more likely to raise a flag than an immediate penalty. Too many flags lead to further investigation – manual or algorithmic – rather than an instant ban.
Our philosophy on footprints is simple. It’s impossible not to leave them – unless you can fly, of course. Instead you should focus on blending into your surroundings rather than trying to be 100% whiter than white.
CMS / Theme footprints
For instance, is building all your websites on WordPress a flag? Possibly, if every link you ever built was from a WordPress blog. But given WordPress is one of the most popular CMSs on the planet, it could be more of a worry if you didn’t have WordPress sites in your backlinks. Many more sites still are unidentifiable or pure HTML.
But if you build every one of your 500 network sites using an obscure CMS from 1996 you might find it stands out a little, though whether Google would ever write a routine to check for that footprint is a moot point.
Likewise, themes can be an obvious sticking point. You don’t want the same theme on every site, that’s for sure. But here’s a question for you to consider. Is it more natural to have 100% unique themes in your network, or to make use of common frameworks such as Twitter Bootstrap? Bootstrap already powers over 7.3 million websites. As ever it is important to strike a balance - it's easy to achieve paralysis if you over-analyse. Just try to weigh the most important factors to avoid getting caught in the noise.
DNS Records: Registrars, Ownership & Registration dates
Is keeping all your whois details the same a red flag? Yes, definite danger zone.
But what about registering them all at one registrar? The answer is it depends. If all your domains are with GoDaddy – like 33% of the web’s domain names – chances are you will blend in, up to a point.
If you move every single one of your domains to a smaller registrar it could be a pretty big signal that the owner of all the domains is the same.
The same can be said of name servers: if all your name servers are the same it is a signal. But if you use nameservers provided by a major registrar such as Namecheap, GoDaddy or Network Solutions it is not a terrible one.
You will blend in much more easily than multiple domains being served by ns1.somesmallwebhost.com. Why? Because these registrars power the DNS for huge chunks of the internet. Again, it would be stranger if none of your links came from domains using GoDaddy nameservers than if all of them did.
Other signals or footprints you should be aware of are:
- All your domains using private registration
- All your domains registered in the same month
- All your domains being the same extension
The big question is how much do these matter? It’s a question of scale and your existing link profile.
If you have 200 natural links and you add ten network domains, all bought in June 2015, it may be a small signal, but nothing major.
If you have no natural links and buy 100 PBN domains in one month then that is far more likely to trigger a flag of some sort.
Similarly, if 80-90% of your linking domains were .coms, or the relevant country domain for your site (uk/de/fr etc), it is unlikely to be an issue. But if you built your network from 100% .xyz domains you may be operating at a higher level of risk.
Hosting and IP Diversity
On hosting – is it madness to host all your blogs on the same IP? Yes, absolutely! Hosting multiple blogs in the same class C IP range (eg 123.255.123.1 and 123.255.123.75)? Also a total giveaway.
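As a rough illustration of the class C point, here is a minimal Python sketch that groups addresses by their /24 block (the modern equivalent of a class C range) and reports any block holding more than one site. The function name `group_by_slash24` and the third address are made up for the example; only the two 123.255.123.x addresses come from the text above.

```python
import ipaddress
from collections import defaultdict


def group_by_slash24(ips):
    """Group IPv4 addresses by their /24 network (the old 'class C' block)."""
    groups = defaultdict(list)
    for ip in ips:
        # strict=False lets us pass a host address and get its enclosing network
        network = ipaddress.ip_network(f"{ip}/24", strict=False)
        groups[str(network)].append(ip)
    return dict(groups)


# Hypothetical host list: the first two addresses are the example from the text
hosts = ["123.255.123.1", "123.255.123.75", "98.76.54.32"]

# Keep only the blocks that contain more than one address
shared = {net: members
          for net, members in group_by_slash24(hosts).items()
          if len(members) > 1}
print(shared)
```

Anything that shows up in `shared` sits in the same /24 as another site on the list, which is the pattern described above.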
Is it crazy to host all your domains with the same cheap hosts everyone else uses? Probably – we’d ask at which point does a ‘cheap web host’ stop being unique and become an SEO host in all but name?
Usually it is when everyone else is following the same tactic. We’ve seen that even if you find a virgin host untouched by SEOs, after a year there are likely to be many more low quality SEO blogs sharing your IPs.
Is hosting all your domains using a cloud based CDN system a footprint? Possibly. But when over 6,000,000 sites are sharing a pool of over 350,000 dynamic IPs the picture is less clear. And as more high quality, well run sites switch to CDN systems there is more and more cover for a handful of blogs.
- If a cheap shared server hosts 500 websites, of which 50 are PBN sites, then 10% of the IP’s websites are under suspicion. Even if the PBNs are all owned by different SEOs!
- 5,000 blogs on a CDN’s IP range shared by 6 million domains is still just 0.08%, roughly 120x less than the shared server! This isn’t even a blip on the radar
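The two shares in this comparison can be worked out directly from the figures quoted (50 of 500 sites on the shared server, 5,000 of 6 million on the CDN). A quick sketch, with `pbn_share` being a name invented for the example:

```python
def pbn_share(pbn_sites, total_sites):
    """Fraction of the sites on a host or IP pool that belong to a PBN."""
    return pbn_sites / total_sites


shared_server = pbn_share(50, 500)      # cheap shared server
cdn = pbn_share(5_000, 6_000_000)       # CDN IP pool

print(f"Shared server: {shared_server:.2%}")          # 10.00%
print(f"CDN pool:      {cdn:.3%}")                    # 0.083%
print(f"Ratio:         {round(shared_server / cdn)}x")  # 120x
```

On these numbers the CDN share is a little under 0.1%, or around 120 times smaller than the shared server's.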
We believe as the quality of SEO hosts and cheap hosts from WebHostingTalk deteriorates SEOs will be left with two options:
- Build your own network of reseller accounts across a higher quality of web hosts.
This is likely to be an expensive solution for all but the biggest link sellers and SEO agencies
- Move to the cloud
IP Reputation: Good and Bad Neighbourhoods
There is one last point on hosts we’d like you to think about that is rarely mentioned in the SEO world. Whether or not an IP address is shared with another PBN site, your hosting matters.
There are two clear patterns that you don’t need to be a genius to see:
- In general, bad websites are hosted at cheaper/crappier web hosting companies
- In general, great websites are hosted with great hosts. Rackspace, Linode, DigitalOcean, Amazon AWS, Cloudflare and CloudFront are obvious examples.
If you were hiding from the law, where would you base yourself? In the ghetto with frequent police stops and shakedowns? Or in an upstanding, smart neighbourhood where good behaviour is the norm? We know where we’d prefer to host our sites!
But hosts like Rackspace are expensive! That’s why we built our hosting network around CDN systems such as Amazon’s immense web infrastructure. Sharing a space with some of the biggest and best tech companies, start-ups and brands in the world gives you a great place to hide out. And the scale of services such as Akamai, Rackspace, Amazon and OnApp mean we can pass on cost savings that beat the competition hands down.
Inbound links and interlinking
One of the easiest footprints to spot in a blog network is a pattern of interlinking between PBN sites.
Sites on the same theme or topic do occasionally link to each other. But it is extremely difficult to replicate a natural linking pattern on your own blog network. Most topics have an enormous number of websites devoted to them. Repetitive linking between a small subset of those sites sticks out like a sore thumb.
Once you link more than two or three sites together clear patterns emerge which make it easy to assume a common owner for each of the blogs. Combine this with common outbound linking patterns and it becomes conclusive proof of manipulation.
Similarly, for those building their own pumper sites on fresh domains, it is wise to avoid re-using the same link sources to build your network.
Each link they share becomes a pattern, so reusing cheap directories, paid link sources or even the same set of web 2.0 sites or profiles is a flag that the sites share a common owner.
It’s also wise to avoid domains with highly spammy links, which we will cover in the following section.
Outbound links and patterns
Linking out to the same sites again and again
Our last major footprint is one of the biggest: repetitive outbound links from one small set of sites to another.
The economics of a link network mean that each additional target you link to from your domains reduces your total cost per link. Bargain!
The problem is, every outbound link to a common target increases the probability that both network sites and target are part of a network.
How so? Sheer numbers. There are billions of sites on the web. Perhaps tens or hundreds of thousands within your niche – depending on whether you call your topic ‘health’ or ‘skin conditions’.
The chance of two sites in the niche linking to a single target is fairly high. But with each additional site you add, the probability that the shared link is a coincidence shrinks dramatically.
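To get a feel for how fast that probability falls, here is a toy model. It assumes, purely for illustration, that every site in a niche links to a given target independently with the same probability `p`. Real linking is not independent and the value 0.05 is made up, so treat the output as a sense of scale rather than a measurement.

```python
def chance_all_link(p, k):
    """Chance that k specific sites all link to one target, if each does so
    independently with probability p (a simplifying assumption)."""
    return p ** k


p = 0.05  # hypothetical chance that any one niche site links to the target

for k in (1, 2, 5, 10):
    print(f"{k:>2} sites: {chance_all_link(p, k):.2e}")
```

Under this assumption each extra site multiplies the chance of a coincidence by `p`, which is why a shared target across dozens of sites is so hard to explain away.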
Say you have a network of 100 sites and one money site. So far, so good – there’s no real reason to tie the sites together.
But if you run 2 sites in the same niche, and link all 100 sites to the same 2 sites, the footprint becomes far greater. How about 10 sites, each and every one with links from the same network of 100 domains? Then you’re really asking for trouble.
So what’s the solution? You need to find a way to balance the risk of creating an obvious network footprint with your own financial constraints.
Different people will have different tolerances for risk. You might be more aggressive than us. But we would recommend avoiding linking to the same sites from more than 20-30% of your network at a time.
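One way to keep an eye on that 20-30% guideline is to count, for each target, what share of the network links to it. The sketch below does this for a made-up four-site network; the domain names, the function names and the 0.3 cut-off are all assumptions chosen for the example.

```python
from collections import defaultdict


def target_share(outbound_links):
    """For each target, return the fraction of network sites that link to it."""
    linking_sites = defaultdict(set)
    for site, targets in outbound_links.items():
        for target in targets:
            linking_sites[target].add(site)
    total = len(outbound_links)
    return {target: len(sites) / total
            for target, sites in linking_sites.items()}


def over_threshold(outbound_links, threshold=0.3):
    """List targets linked from more than `threshold` of the network."""
    shares = target_share(outbound_links)
    return sorted(t for t, share in shares.items() if share > threshold)


# Hypothetical network: each key is a network site, each value its outbound targets
network = {
    "blog-a.example": ["money.example", "charity.example"],
    "blog-b.example": ["money.example"],
    "blog-c.example": ["other.example"],
    "blog-d.example": ["news.example"],
}

print(over_threshold(network))  # ['money.example'], linked from 2 of 4 sites
```

The threshold is a parameter so that a more (or less) aggressive risk tolerance can be plugged in.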
You can also prioritise – if you have one authority site in the niche perhaps you need to link from all 100 sites. But you will also want to protect that site. So the other minisites you run in the same niche may only deserve 10 or 20 links from the same network to minimise the overlap.
Using the same types of links
In the real world, not every site links in the same way every time. Links may be found in the content of an article, or in a ‘Related Links’, ‘Blogroll’ or ‘Footnotes’ section in the sidebar or footer.
The anchors can vary - they could be a branded link, a long sentence about hosting your private network that contains partial match keywords, a (source) link, an image tag, a pure url or even an empty, broken or truncated a href tag.
Only very rarely will they be a blatant exact match anchor.
You should aim to replicate these patterns. We want to use our networks for exact anchors we can’t achieve elsewhere – but try to avoid using the same exact anchors throughout your network.
Target inner pages as well as your homepage and category pages. It's also great to target ‘link bait’ you may have created such as infographics, downloads, giveaways and contests.
Other major outbound linking footprints:
- Linking exclusively to commercial sites
Run blogs on fishing? Even if you’re reviewing fishing charters it would be unusual not to link to marine associations, lifeboat charities and other blogs with shared interests.
- Linking out to a random set of commercial sites
What do 'VPS hosting', 'e-cigarette reviews', 'carpet cleaners San Francisco' and 'dui attorney Las Vegas' have in common? Nothing - except for a gigantic footprint, that is!
Keep your sites niched as much as possible. Even if this means broad concepts such as ‘health’ rather than ‘weight loss’ it’s a hell of a lot better than the alternative.
- Linking out exclusively to low-quality sites
We’re not suggesting your own sites are low quality, of course! But if your network represents the best links your sites have it can stand out like a sore thumb. Link to other authority sites in your niche to avoid obvious patterns.