Posted on Nov 9, 2021

How to Identify and Eliminate Traffic Bots

Traffic bots account for roughly 40% of all internet traffic. Not all of it is bad: SEO tools, copyright bots, and search engine bots (web crawlers) are helpful. But some bots are, unfortunately, bad news: they raise server costs, distort your analytics data and reports, and ruin your mood at the end of the day.

Bad bots evolve hand in hand with anti-fraud tools, and the losses in ad budgets will continue to grow. Still, a number of capable anti-bot tools help identify abusive bot traffic precisely and eventually minimize it.

Anatomy of bot traffic: from definition to types

What is bot traffic? 

So, what does the term bot traffic stand for exactly? In simple words, it is all non-human traffic: online activity performed by automated computer programs rather than by people.

Even though the words “bot traffic” bring Terminator and Megatron to mind, non-human traffic doesn’t necessarily have a negative connotation. In fact, there is plenty of good bot traffic, such as search engine crawlers, and without it the internet wouldn’t be as user-friendly as it is today. So, let’s sort out the types of bots and learn how to filter them.

The good bots

The task of good bots is mainly to collect information about sites across the internet to make the WWW a better place for all users. Good bots come in the form of SEO/search engine bots (crawlers), monitoring bots, digital assistants, and other useful services that scan websites for copyright compliance and detect questionable activities. All this influences sites’ ranking and eventually affects what appears on the first pages of your Google search results (or those of other search engines).

The bad bots

Malicious bots are the ones that intentionally harm performance for their developers’ profit. The names of bad bots may vary from source to source, be it click fraud or DoS, but you can always identify bot traffic by its intention:

Intention to mimic real users – these bots are often used in DDoS (distributed denial of service) attacks, in which a group of malware-infected devices connects to a server or network to slow a website down (for example, by browsing the site for a long time at an unusually slow rate) or to make it unavailable to legitimate users.

They may also be called imposter bots, as they pretend to be genuine visitors. Impersonating bots account for the majority of all bad bot traffic.

Consequences:

These kinds of abusive bot traffic hurt analytics data/reports and influence page views, session duration, and bounce rate, often just on a single page.

New businesses are often tempted to buy bot traffic at $2 per 1,000 users when starting off, to increase credibility in the eyes of real visitors. While the idea is appealing at the beginning, in the long run bot traffic hurts organic visits, and the consequences are hard to “rewind.”

Intention to mimic real engagement – these are usually called spam bots (they usually promote another website’s URL). Their task is to post inappropriate comments on social media and website reviews, fill in contact forms with fake information, including phone numbers (aka form-filling bots), send phishing emails with links to fraudulent websites, imitate page views, and so on.

Consequences:

This type of bot traffic also results in false analytics metrics and reports, discrepancies, poor organic engagement, a higher bounce rate, and an awkward social media presence.

Intention to mimic real clicks – click fraud can occur in a variety of ways (spamming/injection), typically through malware that triggers fake ad clicks, for example in PPC (pay-per-click) ads. These bots click on ads and make it hard to run PPC campaigns profitably. This is one of the most common types of malicious bots.

A subcategory of these bots is inventory hoarding bots. They aim to spoil e-commerce stats and performance by putting items in the cart, making them unavailable to legitimate users. Inventory hoarding bots are not as common as click fraud bots but should be kept in mind nevertheless.

Consequences:

Fake bot clicks lead to a surprisingly high CTR combined with few, mostly junk conversions, which wastes advertisers’ budgets. Fake ad clicks also lead to inaccurate analytics and can skew the developer’s A/B testing.
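The "high CTR, low conversions" pattern can be checked with a few lines of code. Below is a minimal Python sketch, assuming you can export impressions, clicks, and conversions per campaign; the function name, the campaign names, and both thresholds are hypothetical and should be tuned to your own baseline.

```python
def suspicious_campaigns(stats, min_ctr=0.10, max_conversion_rate=0.002):
    """Return campaign names with a very high CTR but almost no conversions.

    Both thresholds are illustrative assumptions, not industry standards.
    """
    flagged = []
    for name, s in stats.items():
        if s["impressions"] == 0 or s["clicks"] == 0:
            continue  # nothing to compare yet
        ctr = s["clicks"] / s["impressions"]
        conversion_rate = s["conversions"] / s["clicks"]
        if ctr >= min_ctr and conversion_rate <= max_conversion_rate:
            flagged.append(name)
    return flagged


# Hypothetical campaign export.
campaigns = {
    "spring_sale": {"impressions": 50_000, "clicks": 900, "conversions": 27},
    "banner_x": {"impressions": 20_000, "clicks": 4_000, "conversions": 2},
}
print(suspicious_campaigns(campaigns))  # ['banner_x']
```

A flagged campaign is only a reason to look closer at its traffic sources, not proof of click fraud.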

Intention to mimic real downloads/installs – these are the automated systems that perform downloads or installs. Third parties use them as a part of DoS (denial of service) attacks to slow down or halt the performance of an app/website. 

However, these bots can also be used by the website owners to embellish the real download/install figures to make the product more appealing for real users (say, in Google Play or an App Store). 

Consequences:

These types of bots also lead to faulty statistics and can affect apps’ position in mobile stores. 

Intention to steal data/content – these bots can do a variety of things. Impersonation, or domain spoofing, is one of the main tactics: for example, malware injects different ads into a website’s traffic without the site owner noticing and then collects the revenue. These bots can also crawl search results, look for personal data and IP addresses, steal content and parse it (to make fake websites rank better in search results), and mimic human behavior.

Consequences:

Malicious bots have many consequences, from fraudulent traffic that slows down human visitors’ access to a website to a loss in ad revenue.

How to detect bot traffic?

Now that we have identified the types of bots, realized their potential and practical danger, let’s learn the best ways of detecting and blocking bots. Not by using search engines, of course. Knowing the enemy is step one. Step two: identify bot traffic. 

To get a full view of the potential danger, you would have to dive into your ad analytics. All major deviations and suspicious bot activity in analytical reports should be examined. According to Google Analytics and other sources, the most telling signs are:

  • A sudden and inexplicable increase in analytics metrics (from visits and an abnormally high bounce rate to extremely long session duration on a single page),
  • Abnormally low or abnormally high pageviews,
  • Sudden problems with serving traffic to a website, its speed, and performance,
  • Suspicious site lists, unexpected locations, data centers, IP addresses (including referral traffic in Google, for example).
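The first sign on this list, a sudden jump in a metric, can be spotted automatically. Here is a minimal Python sketch, assuming you have a list of daily values (visits, pageviews, and so on) exported from your analytics tool. The function name, the 7-day window, and the threshold of 3 standard deviations are illustrative assumptions.

```python
from statistics import mean, pstdev


def flag_spikes(daily_values, window=7, threshold=3.0):
    """Return indexes of days that deviate sharply from the trailing window."""
    flagged = []
    for i in range(window, len(daily_values)):
        history = daily_values[i - window:i]
        avg = mean(history)
        spread = pstdev(history)
        if spread == 0:
            # Flat history: any change at all counts as a deviation.
            if daily_values[i] != avg:
                flagged.append(i)
            continue
        if abs(daily_values[i] - avg) / spread > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily visits; day 8 is an obvious outlier.
visits = [1000, 1040, 980, 1010, 995, 1020, 1005, 1015, 4800, 1000]
print(flag_spikes(visits))  # [8]
```

The same check works for drops as well as spikes, since it looks at the absolute deviation from the recent average.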

A number of analytical instruments can help here, and they are what simple Google searches like “bad bot traffic how to stop” will point you to. They range from the more obvious and general, like the aforementioned Google Analytics account, to tools trained not only to detect bot traffic but also to tell whether a visitor is a spam bot, a good bot, or a real human user.

 To fight off basic bad bot traffic, publishers can:

  • Use device identification
  • Use CAPTCHA/reCAPTCHA 
  • Disallow access from suspicious data centers 
  • Use Account protection
  • Use a CDN (content delivery network), a good solution against basic and moderately smart bots, including DDoS attacks
  • Install a robots.txt file. It is a kind of roadmap that tells bots (good and bad) which parts of your site they may access. Note that it is advisory: well-behaved bots follow it, while malicious bots can ignore it
  • Use rate limiting solutions. These tools monitor the number of requests to a website, typically tracked per IP address. Rate limiting will not stop bot traffic once and for all, but it detects and throttles sudden spikes in activity from one IP address.
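To make the rate limiting idea concrete, here is a minimal Python sketch of a sliding-window limiter keyed by IP address. The class name and the limits are hypothetical, and timestamps are passed in explicitly to keep the example deterministic; a production setup would more likely rely on the rate limiting built into a CDN, load balancer, or web server.

```python
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per IP within the last window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # too many recent requests from this IP
        hits.append(now)
        return True


limiter = SlidingWindowRateLimiter(max_requests=3, window_seconds=10)
results = [limiter.allow("203.0.113.7", t) for t in (0, 1, 2, 3, 11)]
print(results)  # [True, True, True, False, True]
```

The fourth request is rejected because three requests already arrived within ten seconds; by second 11 the oldest ones have expired and the IP is allowed again.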

Essentially, use basic knowledge from your digital marketing experience. 

In some systems, like Google Analytics, for example, web traffic coming from known bots and spiders is automatically excluded. Now, not all systems provide a wall of obstacles for bad bot traffic. And as we will learn in the following sections, not all bot traffic can be fought off by captcha or Google Analytics filters. Traffic bots are getting smarter by the day and are here not only to mess up our bounce rate and page views. 
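As an illustration of what excluding known bots can look like, here is a minimal Python sketch that separates hits whose user agent announces a well-known crawler. This is not how Google Analytics does it internally; the token list is a short, illustrative sample, and the function names and the shape of a hit are assumptions.

```python
# Substrings that well-known crawlers include in their user agent strings.
# Illustrative sample only, not a complete list.
KNOWN_BOT_TOKENS = ("googlebot", "bingbot", "duckduckbot", "yandexbot", "baiduspider")


def is_declared_bot(user_agent):
    """True if the user agent openly identifies itself as a known crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in KNOWN_BOT_TOKENS)


def split_hits(hits):
    """Split hits (dicts with a 'user_agent' key) into (humans, bots)."""
    humans, bots = [], []
    for hit in hits:
        target = bots if is_declared_bot(hit.get("user_agent")) else humans
        target.append(hit)
    return humans, bots


hits = [
    {"user_agent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"},
    {"user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
]
humans, bots = split_hits(hits)
print(len(humans), len(bots))  # 1 1
```

A filter like this only catches bots that announce themselves. A bad bot can send any user agent it likes, which is exactly why such filters are not enough on their own.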

Global bot traffic statistics

A lot has changed since the birth of the first web robot in June 1993, whose sole task was to measure the size of the internet.

Good bots, bad bots, humans

According to Imperva, in 2019, a little over 37% of all internet traffic came from bots: 13% from good bots and 24% from bad bots. Even though the total share of non-human traffic has been dropping over the years, the ratio of good to bad bots has changed. Back in 2014, most bots were the good ones, crawling for Google or other search engines and kindly measuring the “average temperature” across the internet. Today the situation has turned upside down.

The rise of human traffic versus bot traffic

While the amount of malicious bot traffic is slowly declining, the budget spent on digital advertising is constantly growing. According to eMarketer, the global ad budget for digital advertising was roughly $135 billion in 2014, and in 2020 it reached $378 billion.

With that, $35 billion was lost to ad fraud in 2020. The World Federation of Advertisers estimates that by 2025 the number will reach $50 billion!

When we think about bad bot traffic, we imagine other sites and often think that bots probably will have no interest in attacking our resources. False. Bots will attack any vulnerable place they can find.

Traffic bots in different industries

While many businesses have a common traffic bot problem, some sophisticated bots target particular industries.

Sophisticated programmatic bot traffic and how SmartHub can help fight it

Now that we have looked at all the general information, it is time to dive deeper into ad fraud. This will reveal more complex issues and ways to manage and combat bot traffic.

The Sophisticated bots

We cannot stop bot traffic with Google Analytics or tools like CAPTCHA alone, because not all bots are simple programs performing basic commands and repetitive tasks. Some are, in fact, very sophisticated and able to bypass most anti-bot systems while performing click fraud or fake installs.

Sophisticated bots are a subcategory of bad bots, but they are the worst because they mimic human traffic so well that it is hard to distinguish them even using special tools. In cinema terms, basic bots are extras on the set, and sophisticated bots demonstrate an Oscar-winning performance. What is vital is to detect bots and learn how to block bots (the bad ones, of course).

Types of traffic bots

Ways to avoid bot traffic, including sophisticated

1. ads.txt

This initiative, created by the IAB, was the first step toward programmatic transparency. Its main goal was to prevent domain spoofing and the unsanctioned sale of inventory by unauthorized companies.

Basically, it is a text file (not to be confused with the robots.txt file) saved to the root folder of a website, containing a list of companies authorized to sell the publisher’s inventory. It is beneficial both to advertisers (mainly online advertising networks) and publishers: the former protect their own platforms, and the latter can trace network requests and see where web traffic was coming from at any given moment.
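For a feel of what such a file contains, here is a minimal Python sketch that parses ads.txt records and checks whether a seller is listed. Each record is a comma-separated line: the ad system's domain, the publisher's account ID in that system, the relationship (DIRECT or RESELLER), and an optional certification authority ID. The sample domains, IDs, and function names are hypothetical.

```python
SAMPLE_ADS_TXT = """\
# ads.txt for publisher.example (hypothetical)
contact=adops@publisher.example
exchange.example, 12345, DIRECT, abc123
reseller.example, 98765, RESELLER
"""


def parse_ads_txt(text):
    """Parse ads.txt content into a list of seller records."""
    records = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        fields = [field.strip() for field in line.split(",")]
        if len(fields) < 3:
            continue  # blank lines and variables such as contact=...
        records.append({
            "domain": fields[0].lower(),
            "account_id": fields[1],
            "relationship": fields[2].upper(),
            "cert_authority_id": fields[3] if len(fields) > 3 else None,
        })
    return records


def is_authorized(records, domain, account_id):
    """True if the (ad system domain, account ID) pair is listed."""
    return any(
        r["domain"] == domain.lower() and r["account_id"] == account_id
        for r in records
    )


records = parse_ads_txt(SAMPLE_ADS_TXT)
print(is_authorized(records, "exchange.example", "12345"))  # True
print(is_authorized(records, "exchange.example", "99999"))  # False
```

A buyer who sees inventory for publisher.example offered under an account that is not in the list has a good reason to treat the offer as spoofed.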

2. sellers.json

However, when several intermediaries participate in the sale of limited inventory or inventory in general, network requests are even harder to follow, and ads.txt alone no longer works as a bot management solution. This is when the next IAB initiative comes into play: sellers.json, a JSON file published by SSPs and ad exchanges. It lists the sellers they work with, so all parties know whom they are transacting with.
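A short Python sketch shows how a buyer might look up a seller in a sellers.json file. The field names (seller_id, name, domain, seller_type, is_confidential) follow the IAB sellers.json format; the sample content and the function names are hypothetical.

```python
import json

SAMPLE_SELLERS_JSON = """
{
  "version": "1.0",
  "sellers": [
    {"seller_id": "12345", "name": "Publisher Example", "domain": "publisher.example", "seller_type": "PUBLISHER"},
    {"seller_id": "98765", "name": "Reseller Example", "domain": "reseller.example", "seller_type": "INTERMEDIARY"},
    {"seller_id": "55555", "is_confidential": 1, "seller_type": "PUBLISHER"}
  ]
}
"""


def index_sellers(sellers_json_text):
    """Build a seller_id -> seller entry lookup from sellers.json content."""
    data = json.loads(sellers_json_text)
    return {str(seller["seller_id"]): seller for seller in data.get("sellers", [])}


def describe_seller(index, seller_id):
    """Summarize what the file reveals about a seller ID."""
    seller = index.get(str(seller_id))
    if seller is None:
        return "unknown seller"
    if seller.get("is_confidential") == 1:
        return "confidential seller"
    return f'{seller.get("name")} ({seller.get("seller_type")})'


sellers = index_sellers(SAMPLE_SELLERS_JSON)
print(describe_seller(sellers, "98765"))  # Reseller Example (INTERMEDIARY)
print(describe_seller(sellers, "00000"))  # unknown seller
```

An unknown or confidential seller is not necessarily fraudulent, but it is a natural place to start asking questions.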

3. SupplyChain Object

The latest transparency initiative from IAB at the moment is the SupplyChain Object. It provides an outlook on the whole supply chain: from seller ID to transactions concluded with them. This way, the buyer gets a complete image of all the players involved and thus can track down to the traffic source all the suspicious activities, including unauthorized bot traffic.
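In OpenRTB bid requests, the SupplyChain object is a small structure with a "complete" flag and a list of "nodes", where each node names the advertising system ("asi") and the seller ID within it ("sid"). Here is a minimal Python sketch of walking such a chain, assuming you already hold a set of (asi, sid) pairs you trust, for instance collected from ads.txt and sellers.json files. The function name and sample values are hypothetical.

```python
def check_supply_chain(schain, known_sellers):
    """Return a list of problems found in a SupplyChain object.

    known_sellers is a set of (asi, sid) pairs considered legitimate.
    """
    problems = []
    if schain.get("complete") != 1:
        problems.append("chain is not marked complete")
    nodes = schain.get("nodes", [])
    if not nodes:
        problems.append("chain has no nodes")
    for position, node in enumerate(nodes):
        key = (node.get("asi"), node.get("sid"))
        if key not in known_sellers:
            problems.append(f"node {position}: {key[0]} / {key[1]} is not a known seller")
    return problems


known_sellers = {("exchange.example", "12345"), ("reseller.example", "98765")}
schain = {
    "ver": "1.0",
    "complete": 1,
    "nodes": [
        {"asi": "exchange.example", "sid": "12345", "hp": 1},
        {"asi": "shady.example", "sid": "777", "hp": 1},
    ],
}
print(check_supply_chain(schain, known_sellers))
```

Here the second node would be reported, which points the buyer at the exact hop in the chain that deserves a closer look.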

Together, these initiatives create a transparent digital ad buying process, providing a list of all the participants of any transaction. So, if an advertiser notices suspicious bot activity at any given time, they can track it through all the intermediaries down to the end seller, thus cutting off all the shifty players. A piece of quick advice: work only with digital ad partners who comply with IAB initiatives.

4. Complex anti-fraud tools

When dealing with sophisticated bots, there are a few crucial points to keep in mind: anti-fraud tools must react in milliseconds and do so in a way that doesn’t scare away real human users and real customers. So it is vital to find the right bot management solution.

But even if you decide to settle on one or a couple of the available bot managing solutions, you will have to test them out beforehand.

We have nothing against the trial-and-error method, but we think it is better to play it safe when it comes to potential serious budget losses and deciding how to stop bad bot traffic.

With the growing variety of malicious bot traffic, it is not practical to use only one system. It is way more effective to use complex tools with different “fighting techniques” to stop malicious bots, which will improve all the statistics, from search visibility, session duration, and organic website traffic to ad revenue. 

It is even more convenient to use a tool that combines all anti-fraud solutions in one, like SmartHub.

SmartHub anti-fraud scanners to the rescue

SmartHub is a white label ready-to-use ad tech platform (like AdExchange) that helps unite the sell and buy sides in the most sophisticated way. With the user-friendly dashboard, smart optimization, and easy to grasp reports, this technology will save you a lot of time (and money) that you would otherwise spend on getting to know a number of other solutions. However, one of its key features is managing all types of bot traffic with traffic safety scanner providers. 

Fraud protection integration of SmartHub

So, whether you are an avid user of the technology or just deciding on the tech product for your business, you should know that SmartHub has probably one of the most exhaustive collections of bot management solutions on the market. They help to pinpoint all the bad bots and other suspicious actions to protect your marketplace.  

In SmartHub you will find time-tested instruments such as Pixalate, WhiteOps, and Forensiq, which offer more innovative approaches to tracking and blocking unwanted bot traffic.

These are some of SmartHub’s traffic filtering procedures:

  • Mismatched IPs and Bundles throttling
  • IFA and IPv4 missing requests throttling
  • Secured traffic filtering
  • Adult traffic filtering
  • Blocked CRIDs
  • Blocked categories
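To give a rough idea of what checks like these involve, here is a minimal Python sketch of two of them: detecting requests with a missing IFA or IPv4 address, and rejecting bids with a blocked creative ID (CRID) or category. This is not SmartHub's implementation. It only assumes OpenRTB-style field names (device.ifa, device.ip, crid, cat); the function names and sample values are hypothetical.

```python
import ipaddress


def missing_device_signals(bid_request):
    """List which of the IFA / IPv4 signals are absent from a bid request."""
    device = bid_request.get("device", {})
    missing = []
    if not device.get("ifa"):
        missing.append("ifa")
    try:
        ipaddress.IPv4Address(device.get("ip") or "")
    except ValueError:
        # Empty, malformed, or non-IPv4 value.
        missing.append("ipv4")
    return missing


def is_blocked_bid(bid, blocked_crids, blocked_categories):
    """True if the bid's creative ID or any of its categories is blocked."""
    if bid.get("crid") in blocked_crids:
        return True
    return any(cat in blocked_categories for cat in bid.get("cat", []))


request = {"device": {"ip": "2001:db8::1"}}  # no IFA, IPv6 only
print(missing_device_signals(request))  # ['ifa', 'ipv4']

bid = {"crid": "creative-42", "cat": ["IAB7"]}
print(is_blocked_bid(bid, blocked_crids={"creative-13"}, blocked_categories={"IAB7"}))  # True
```

A real platform would throttle or score such traffic rather than simply dropping it, since a missing signal is a hint of low quality, not proof of fraud.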

SmartHub has this army of protective tools to foresee all possible fraud schemes (like impression fraud) and prevent attacks by malicious bots from different angles, so that you can focus on managing the media-trading process and your supply partners can focus on providing the best possible inventory (including unique limited inventory).

Afterword

So we have learned that bot traffic is internet traffic coming from non-humans. The current situation with bad bot traffic suggests that its share will likely continue to grow for some time. Then, eventually, digital ad players will realize how much money they are losing or not earning because of malicious bot traffic and will react more actively.

Publishers can start by managing basic bot traffic, using instruments like device identification or account protection. However, if you run your own ad network, we suggest a more global approach, like connecting the available anti-fraud solutions in SmartHub to secure your marketplace once and for all.

Want your clients’ ads and traffic bot-free? Sign up today!
