Managing and Mitigating Bots

Book description

Automated traffic is a major factor in the modern internet, especially for website owners. Bots, spiders, and scrapers now generate more traffic than human users, and both the number and the sophistication of bot types are expected to keep growing. With this handbook, Intechnica co-founder Andy Still provides guidelines to help public-facing sites identify and manage various types of automated traffic.

Some bots are beneficial, such as those generating automated traffic triggered by direct human action, but many others are highly profitable ventures run by organized criminal groups. Bots can negatively impact your site and your business by:

  • Fraudulently taking advertising revenue without displaying ads to potential customers
  • Stealing website content for competitors to use on their own sites
  • Publishing offensive content in your comments or on your forum pages
  • Accessing your users’ personal data for use elsewhere
  • Creating fake accounts to take unfair advantage of special offers
  • Skewing analytics in ways that lead you to make poor business decisions

Managing and Mitigating Bots helps you determine which traffic is non-human and requires action, and then guides you through the process of defining a bot-handling policy tailored to the type of bot traffic you identify.

Table of contents

  1. Introduction
  2. I. Background
  3. 1. What Is Automated Traffic?
    1. Key Characteristics of Automated Traffic
      1. Web-based Systems
      2. Layer 7
      3. Legitimate Requests
    2. Exclusions
      1. DDoS (Distributed Denial of Service)
      2. Security Vulnerability Exploits
  4. 2. Misconceptions of Automated Traffic
    1. Misconception: Bots Are Just Simple Automated Scripts
    2. Misconception: Bots Are Just a Security Problem
    3. Misconception: Bot Operators Are Just Individual Hackers
    4. Misconception: Only the Big Boys Need to Worry About Bots
    5. Misconception: I Have a WAF, I Don’t Need to Worry About Bot Activity
  5. 3. Impact of Automated Traffic
    1. Company Interests
    2. Other Users
    3. System Security
    4. Infrastructure
  6. II. Types of Automated Traffic
  7. 4. Malicious Bots
    1. Application DDoS
  8. 5. Data Harvesting
    1. Search Engine Spiders
    2. Content Theft
    3. Price Scraping
    4. Content/Price Aggregation
    5. Affiliates
    6. User Data Harvesting
  9. 6. Checkout Abuse
    1. Scalpers
    2. Spinners
    3. Inventory Exhaustion
    4. Snipers
    5. Discount Abuse
  10. 7. Credit Card Fraud
    1. Card Validation
    2. Card Cracking
    3. Card Fraud
  11. 8. User-Generated Content (UGC) Abuse
    1. Content Spammer
  12. 9. Account Takeover
    1. Credential Stuffing/Credential Cracking
    2. Account Creation
    3. Bonus Abuse
  13. 10. Ad Fraud
    1. Background to Internet Advertising
    2. Banner Fraud
    3. Click Fraud
    4. CPA Fraud
    5. Cookie Stuffing
    6. Affiliate Fraud
    7. Arbitrage Fraud
  14. 11. Monitors
    1. Availability
    2. Performance
    3. Other
  15. 12. Human-Triggered Automated Traffic
  16. III. How to Effectively Handle Automated Traffic in Your Business
  17. 13. Identifying Automated Traffic
    1. Indications of an Automated Traffic Problem
    2. Challenges
    3. Generation 0: Genesis—robots.txt
    4. Generation 1: Simple Blocking—Blacklisting and Whitelisting
    5. Generation 2: Early Bot Identification—Symptom Monitoring
    6. Generation 3: Improved Bot Identification—Real User Validation
      1. Real Browser Validation
      2. Fingerprinting
    7. Generation 4: Sophisticated Bot Identification—Behavioral Analysis
  18. 14. Managing Automated Traffic
    1. Blocking
    2. Validation Requests
      1. Offline Validation
      2. Inline Validation
    3. Alternative Servers/Caching
    4. Alternative Content
  19. Conclusion

Product information

  • Title: Managing and Mitigating Bots
  • Author(s): Andy Still
  • Release date: April 2018
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492029373