Managing bot traffic
A significant share of the traffic that web resources handle comes from bots. Some of these bots, such as web crawlers, are legitimate and essential for the operation of various web services.
Smart Web Security offers flexible protection and bot traffic filtering tools:
- Verified bot categories by intended use or type of activity (e.g., AcademicResearchBot or AISearchBot).
- Up-to-date lists of legitimate bots used by various services and companies (e.g., YandexBot, Googlebot, or Bingbot).
- A dedicated attribute for identifying verified bots.
- Configurable bot score thresholds from 0 to 100 for rule customization.
You can use the above traffic conditions in security profile (basic, Smart Protection, or WAF) and ARL profile rules.
To further fine-tune your security rules, check logs for traffic filtering results and bot scores.
Verified bot categories
- AcademicResearchBot: Collect web data used in academic research; these bots help automate gathering information for analysis and scientific study.
- AccessibilityBot: Improve web content accessibility for users with special needs, e.g., bots that help read out text on websites.
- AdvertisingAndMarketingBot: Support advertising and marketing campaigns.
- AggregatorBot: Gather and distribute information, e.g., bots for aggregating articles from news websites.
- AIAssistantBot: AI-based assistants involved in a wide range of tasks.
- AICrawlerBot: Help train and improve AI models, e.g., bots that collect data for training machine learning algorithms.
- AISearchBot: AI-driven bots optimized for interactive search and delivering information in response to user queries.
- ArchiverBot: Capture and archive snapshots of web pages and other web resources.
- FeedFetcherBot: Fetch data from various regularly updated sources, e.g., news feeds and blogs, helping users receive up-to-date information.
- MonitoringAndAnalyticsBot: Collect website analytics. These bots monitor website metrics, e.g., traffic, load speed, or error rates, to provide analytical data to improve site performance.
- PagePreviewBot: Generate page previews for links shared by users in messenger apps or social media.
- SearchEngineCrawlerBot: Scan the internet and index pages for search engines. They help systems, such as Yandex, update their databases and deliver up-to-date search results.
- SearchOptimizationBot: Help optimize websites and improve their visibility in search engines by analyzing pages for errors and suggesting changes to page content and structure.
- SecurityBot: Check websites for vulnerabilities and various security threats.
- SocialMediaMarketingBot: Help manage brand presence on social platforms by automating posting, moderation, and replies to users as well as collecting analytics for SMM campaigns and assessing their performance.
- WebhooksBot: Automate processes using real-time technologies that enable web applications to interact with each other.
- OthersBot: Other categories of verified bots.
List of legitimate bots
Below is a list of bots used by various businesses for content indexing, delivering information in response to user queries, SEO, and AI model training. The list is updated regularly. You can use this list to configure traffic conditions tailored to your particular service or application.
Legitimate bots from trusted services and companies
- YandexBot: Yandex's primary indexing bot.
- YandexAccessibilityBot: Checks website accessibility and performance.
- YandexAdNetBot: Advertising bot.
- YandexBlogs: Indexes blog posts for Yandex Search.
- YandexMirrorDetectorBot: Detects website mirrors.
- YandexCalendar: Downloads calendar files in response to user queries. These files are often located in directories excluded from indexing.
- YandexDirect: Fetches content info from partner websites in the Yandex Advertising Network to define website categories and select relevant ads.
- YandexFavicons: Downloads website icons (favicon.ico) to display them in search results.
- YaDirectFetcher: Yandex bot that fetches ad pages to check their availability and update a category.
- YandexForDomain: Verifies domain ownership.
- YandexImages: Indexes images for Yandex Search.
- YandexImageResizer: Gathers and processes website images for display across devices and formats.
- YandexMobileBot: Identifies pages optimized for mobile devices.
- YandexMarket: Yandex Market bot for retrieving up-to-date product information from online stores.
- YandexMedia: Indexes multimedia data for Yandex Search.
- YandexMetrika: Loads website pages, including Yandex Direct ads, to check their availability.
- YandexMobileScreenShotBot: Captures screenshots of web pages for mobile devices.
- YandexNews: Aggregates news.
- YandexOntoDB: Generates response cards in Yandex Search.
- YandexPagechecker: Accesses a page when microdata validation is initiated via the structured data testing tool.
- YandexPartner: Retrieves content info from Yandex partner sites.
- YandexRCA: Gathers data for generating page previews.
- YandexRenderResourcesBot: Loads resources for JavaScript page rendering.
- YandexSearchShop: Downloads product catalog YML files in response to user queries. These files are often located in directories restricted for indexing.
- YandexSitelinks: Verifies availability of pages used as sitelinks.
- YandexSpravBot: Yandex Business bot.
- YandexTracker: Yandex Tracker bot.
- YandexUserproxy: Manages user-initiated actions across Yandex services, e.g., sends requests in response to button clicks or loads pages for online translation.
- YandexVertis: Search bot for Yandex Verticals.
- YandexVerticals: Bot for classifieds platforms: Auto.ru, Yandex Realty, Yandex Jobs, and Yandex Reviews.
- YandexVideo: Indexes videos for display in search results.
- YandexWebmaster: Yandex Webmaster SEO bot.
- YandexScreenshotBot: Bot for fast and easy webpage screenshots.
- YandexAdditionalBot: Generates quick responses with YandexGPT.
- YandexComBot: Indexes content for Yandex search in languages other than Russian.
- ZenBot: Yandex Zen bot for aggregating news from customer portals.
- MailruBot: Mail.ru bot.
- Googlebot: Google bot for website indexing.
- StorebotGoogle: Google bot for indexing online stores.
- GoogleInspectionTool: Google bot used in website testing tools.
- GoogleOther: Google general-purpose bot used by various product teams for fetching publicly accessible content from sites.
- GoogleCloudVertexBot: Google bot used by website owners to build Vertex AI agents.
- GoogleExtended: Google bot for AI-powered search.
- APIsGoogle: Sends push notifications via Google APIs.
- AdsBotGoogle: Google bot that checks web page ad quality.
- MediapartnersGoogle: Google bot that scans AdSense participating websites to provide them with relevant ads.
- GoogleSafety: Google bot that handles abuse-specific crawling, such as malware discovery for publicly posted links in Google products.
- FeedFetcherGoogle: Scans RSS or Atom feeds for Google News and PubSubHubbub.
- GoogleProducer: Fetches and processes feeds explicitly provided by publishers for Google News landing pages.
- GoogleReadAloud: Retrieves and reads web pages using text-to-speech upon user request.
- GoogleSiteVerification: Retrieves Search Console verification tokens to confirm site ownership in Google.
- Bingbot: Indexes web content for Microsoft.
- AdIdxBot: Bing Ads search bot that crawls ads and follows the websites from those ads for quality control.
- BingPreview: Generates page previews in Bing.
- MicrosoftPreview: Generates page previews for Microsoft services.
- Amazonbot: Amazon search bot used to improve service quality, e.g., helping Alexa to provide more accurate answers to customer questions.
- Applebot: Collects data for tools integrated into user experiences in Apple's ecosystem (Spotlight, Siri, Safari), e.g., the search technology.
- FacebookExternalHit: Collects, caches, and displays website or app metadata, including its title, description, and icon (thumbnail image).
- MetaExternalAgent: Crawls web content to train AI models and improve products through direct indexing.
- Meta‑ExternalFetcher: Enables Meta AI to provide users with up-to-date information beyond its training data.
- Pinterestbot: Crawls publicly accessible websites to index content and drive traffic, checks that pin data, e.g., its price and title, is up-to-date, and removes broken links.
- Qwantbot: Indexes web content for Qwant.
- CCBot: Archives web pages for commoncrawl.org.
- YahooSlurpBot: Yahoo's primary indexing bot.
- YahooLinkPreviewBot: Generates page previews for Yahoo.
- YahooMailProxyBot: Generates link previews in Yahoo Mail messages.
- YahooAdMonitoring: Fetches content from URLs specified in Yahoo ad services; uses target page content to improve ad relevance and user experience.
- Pingdom: Performs automated website testing and monitoring.
- SEMrushBot: SEO crawler by SEMrush.
- OdklBot: Bot by the Odnoklassniki social media platform.
- vkShareBot: Implements the Share widget on web pages: clicking the widget automatically creates a post in the VK social media platform with a link to the site.
- Twitterbot: Generates page previews in X.
- TelegramBot: Generates page previews in Telegram.
- SeznamBot: Indexes web content for Seznam.
- WebArchiveBot: Archives web pages for archive.org.
- DuckDuckBot: Indexes web content for DuckDuckGo.
- PetalBot: Indexes web content for Huawei's Petal Search.
- AhrefsBot: Indexes web content for the Yep search engine.
- AhrefsSiteAudit: SEO bot for the Ahrefs Site Audit tool.
- DataForSEOBot: SEO bot for DataForSEO.
- SeekportBot: Indexes web content for the Seekport search engine.
- Serpstatbot: SEO bot for Serpstat.
- GPTBot: Bot for training ChatGPT models.
- ChatGPTUser: Visits web pages specified by a ChatGPT user.
- OAISearchBot: Powers smart search in ChatGPT.
- Synthetics: Performs automated website testing and monitoring.
- UptimeRobot: Performs automated testing and monitoring.
- BLEXBot: SEO bot for SE Ranking.
- BaiduSpider: Indexes web content for the Chinese Baidu search engine.
- Stripe: Automates interactions with the Stripe payment platform.
- ClaudeBot: Bot for training Claude models.
- Claude‑User: Enables Claude AI to access websites via the Claude-User agent when processing user queries.
- Claude‑SearchBot: Analyzes online content to improve the relevance and accuracy of user queries.
- PerplexityBot: Bot for training Perplexity models.
- Perplexity‑User: Visits web pages specified by a Perplexity user.
- QcBot: Bot for QUIC.cloud web services which require data to be fetched from and returned to your website. For example, the bot retrieves images from your site for optimization and notifies your site when the updated images are ready to be pulled.
- AudistoBot: SEO and website monitoring bot.
- CoccocBot: Primary indexing for the CốC CốC search engine.
- IASCrawlerBot: Digital ad analysis and verification bot.
- IbouBot: Ibou's primary indexing bot.
- JobswithgptcomBot: Aggregates job postings for jobswithgpt.com.
- MonsidoBot: Monitors your website's status using Monsido.
- PaqleBot: Paqle's primary indexing bot.
- QuantcastBot: Analyzes ads displayed on websites.
- SvnBot: Monitors site availability.
- TestomatoBot: Monitors site availability.
- PingAdminBot: Monitors site availability.
- Meta‑WebIndexerBot: Crawls web resources to improve the quality of Meta AI search results for users.
- Meta‑ExternalAdsBot: Crawls web content for use cases such as improving advertising and other business-related products and services.
Bot score
To configure custom rules for your traffic, you can define filtering rules based on bot score.
Smart Web Security assigns traffic a score from 0 (lowest probability, i.e., human) to 100 (highest probability, i.e., bot).
The following threshold ranges are used in decision-making:
- Up to 20: Human
- 20–40: Likely human
- 40–60: Undetermined
- 60–80: Likely bot
- Over 80: Bot
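The threshold ranges above can be illustrated with a small Python sketch that maps a score to its label. This is not part of Smart Web Security: the function name `classify_bot_score` is hypothetical, and because the ranges share their boundary values, the sketch assumes each boundary (20, 40, 60, 80) belongs to the lower range, which matches the "up to 20" and "over 80" wording but is not confirmed for the intermediate ranges.

```python
def classify_bot_score(score: int) -> str:
    """Map a bot score (0-100) to the label of its threshold range.

    Assumption: boundary values belong to the lower range,
    e.g. a score of exactly 20 is labeled "Human".
    """
    if not 0 <= score <= 100:
        raise ValueError(f"Bot score must be between 0 and 100, got {score}")
    if score <= 20:
        return "Human"
    if score <= 40:
        return "Likely human"
    if score <= 60:
        return "Undetermined"
    if score <= 80:
        return "Likely bot"
    return "Bot"
```

For instance, under these assumptions `classify_bot_score(35)` returns "Likely human" and `classify_bot_score(95)` returns "Bot".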
In request filtering conditions, specify threshold values using the >=, <=, =, or != operators and the AND logical operator.
For example, =10, >=20 AND <=40.
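To reason about which scores a condition such as `>=20 AND <=40` would match before adding it to a rule, you can model it locally. The Python sketch below is only an illustration: the function `matches` and its parsing logic are hypothetical, not how the service evaluates conditions, and the sketch assumes integer thresholds joined only by AND.

```python
import operator
import re

# Comparison operators listed in the text, mapped to Python equivalents.
_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "!=": operator.ne,
    "=": operator.eq,
}
# One comparison: an operator followed by an integer threshold.
_COMPARISON = re.compile(r"^(>=|<=|!=|=)\s*(\d+)$")


def matches(condition: str, score: int) -> bool:
    """Return True if the score satisfies every comparison in the condition."""
    for part in re.split(r"\s+AND\s+", condition.strip()):
        found = _COMPARISON.match(part.strip())
        if found is None:
            raise ValueError(f"Unsupported comparison: {part!r}")
        compare = _OPERATORS[found.group(1)]
        if not compare(score, int(found.group(2))):
            return False
    return True


print(matches("=10", 10))            # True
print(matches(">=20 AND <=40", 30))  # True
print(matches(">=20 AND <=40", 41))  # False
```

Requiring every comparison to hold mirrors the AND operator; the sketch rejects anything else with a `ValueError` rather than guessing at unsupported syntax.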
For final tuning, apply the rule in logging-only mode and analyze the logs to determine the optimal threshold.
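One way to approach this analysis is to check what share of logged requests each candidate threshold would match. The Python sketch below assumes you have already extracted the bot scores from your logs into a list of integers; the log format itself is not covered here, and the function name `match_share_by_threshold` and the sample scores are hypothetical.

```python
def match_share_by_threshold(scores, thresholds):
    """For each candidate threshold, return the share of requests
    whose bot score is greater than or equal to it."""
    total = len(scores)
    if total == 0:
        raise ValueError("No scores to analyze")
    return {
        threshold: sum(1 for score in scores if score >= threshold) / total
        for threshold in thresholds
    }


# Hypothetical sample of bot scores taken from logs.
sample_scores = [5, 12, 35, 55, 72, 88, 93, 97]
print(match_share_by_threshold(sample_scores, [60, 80, 90]))
# {60: 0.5, 80: 0.375, 90: 0.25}
```

Comparing these shares with the requests you know to be legitimate can help you judge whether a threshold is too strict or too lenient before you switch the rule out of logging mode.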