1592+ crawlers profiled

Crawler Directory

Every web crawler and AI bot we track. What each one does, who operates it, and how to protect your content.

Ozon Web Grabber — What It Is and How to Handle It
A component that loads previews for external and internal links. For external links, information from the Open Graph tags on the page (title, description, images/video) is used whenever possible; for references to internal objects, the internal representation is used (in the form of specialized blocks in the topic).
Preview
Opengraph Bot — What It Is and How to Handle It
Our API is used mostly by consumer-facing products to preview links shared on their platforms. For example, when a link is shared on Facebook or Slack, those platforms show a description, title, and image to make the content more enticing.
Preview
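Both preview bots above rely on Open Graph tags. As a rough illustration of what such a bot reads from a page, here is a minimal sketch using only the Python standard library. The class name `OpenGraphParser`, the helper `extract_open_graph`, and the sample HTML are hypothetical and are not taken from either service.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags into a dict (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in an HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


# Hypothetical page used only for this demonstration.
SAMPLE_HTML = """
<html><head>
  <title>Fallback title</title>
  <meta property="og:title" content="Example page">
  <meta property="og:description" content="A short summary.">
  <meta property="og:image" content="https://example.com/cover.png" />
</head><body></body></html>
"""

preview = extract_open_graph(SAMPLE_HTML)
print(preview)
```

If a page has no `og:` tags, the returned dict is empty, and a real preview service would presumably fall back to other metadata such as `<title>`.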
Automaton — What It Is and How to Handle It
An end-to-end campaign and integration testing tool created to optimize your marketing, advertising and sales technology stack by ensuring setups are running as they should be.
Monitoring
Adagio Bot — What It Is and How to Handle It
Adagio demand optimization solutions help publishers leverage unlimited demand sources at unprecedented revenue levels, while improving user experience, SPO, and carbon footprint.
Monitoring
WebSpiderMount — What It Is and How to Handle It
WebSpiderMount is a professional job wrapping service that extracts job listings from employer websites and distributes them to job boards, recruitment platforms, and job alert systems with real-time scraping capabilities.
Aggregator
WP Time Capsule — What It Is and How to Handle It
Our plugin WPTimeCapsule is installed on more than 30,000 WordPress sites. When our backup servers send requests to trigger the backup on those WordPress sites, the requests are being blocked.
Other
WARDBot — What It Is and How to Handle It
WARDBot is a website monitoring crawler that tracks URL status codes and monitors the availability of web pages to help users ensure their websites remain accessible and functional.
AI Crawler, AI Training
Daum — What It Is and How to Handle It
Daum is a major Korean search engine crawler operated by Kakao Corp, systematically indexing Korean websites and providing comprehensive search services including web, news, images, and local content for Korean users.
Search Engine
Google Scholar — What It Is and How to Handle It
Google Scholar uses a bot to crawl and index scholarly literature from academic publishers, repositories, and university websites. This populates its academic search engine.
Search Engine
Slickstream — What It Is and How to Handle It
Slickstream is a SaaS that indexes our customers' websites (with their approval) in order to provide engagement features for their site visitors, including site search and content recommendations.
Advertising
DigiCert DCV — What It Is and How to Handle It
DigiCert DCV (Domain Control Validation) is a bot operated by DigiCert that validates domain ownership for SSL/TLS certificate issuance. It verifies that certificate applicants control the domains they want to secure.
SEO Tool
Stape Scanner — What It Is and How to Handle It
Stape Scanner is a monitoring crawler that checks website availability, performance, and health.
Monitoring
All Africa Crawler — What It Is and How to Handle It
AllAfrica Global Media produces, aggregates, and distributes news from across Africa, relying on agreements with more than 140 news organizations and over 500 other institutions and individuals. The AllAfrica NewsBot scrapes content from sites with which AllAfrica has written agreements, or whose content is available without licensing restrictions or is otherwise freely distributable. In all cases, the author and institution are credited in full.
Search Engine
LegalMonster — What It Is and How to Handle It
LegalMonster is a privacy compliance monitoring service that scans customer websites for cookie and privacy legislation compliance, helping businesses meet data protection requirements.
Security
Shortwave Image Fetcher — What It Is and How to Handle It
Shortwave Image Fetcher is a privacy-focused service that proxies images from HTML emails through Shortwave's servers to protect users' IP addresses and maintain email privacy when viewing content in their AI-powered email client.
Preview
SkroutzBot — What It Is and How to Handle It
SkroutzBot is a specialized web crawler used by Skroutz to download XML data feeds, product images, and sample product pages from merchant websites for quality control and product catalog management.
Feed Reader
SlackbotLinkExpanding — What It Is and How to Handle It
This robot responds to links that Slack users post into their channels. It fetches as little of the page as it can (using HTTP Range headers) to extract meta tags about the content.
Preview
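The Range-header approach described for SlackbotLinkExpanding can be sketched roughly as follows. This is not Slack's implementation: the byte budget `MAX_BYTES`, the function names, and the sample page are assumptions made for illustration, and the network helper `fetch_prefix` is defined but never called here.

```python
import urllib.request
from html.parser import HTMLParser

MAX_BYTES = 16384  # assumed budget; not a documented Slack value


def build_range_request(url, max_bytes=MAX_BYTES):
    """Build a GET request that asks only for the first max_bytes bytes."""
    return urllib.request.Request(
        url,
        headers={"Range": f"bytes=0-{max_bytes - 1}"},
    )


def fetch_prefix(url, max_bytes=MAX_BYTES):
    """Fetch at most max_bytes of a page.

    A server may ignore Range and answer 200 with the full body,
    so the read is capped on the client side as well.
    """
    request = build_range_request(url, max_bytes)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read(max_bytes)


class MetaCollector(HTMLParser):
    """Collect <meta> tags keyed by their property or name attribute."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = attr.get("property") or attr.get("name")
        if key and attr.get("content") is not None:
            self.meta.setdefault(key, attr["content"])


def meta_from_prefix(chunk):
    """Parse meta tags out of a possibly truncated HTML byte string."""
    collector = MetaCollector()
    collector.feed(chunk.decode("utf-8", errors="replace"))
    return collector.meta


# Offline demonstration: simulate a partial response by slicing a sample page.
page = (
    b'<html><head><meta property="og:title" content="Hello">'
    b'<meta name="description" content="A short summary"></head><body>'
    + b"<p>filler</p>" * 1000
    + b"</body></html>"
)
request = build_range_request("https://example.com/article", 512)
meta = meta_from_prefix(page[:512])
print(request.get_header("Range"), meta)
```

Because meta tags normally sit in `<head>`, a small prefix of the document is usually enough, which is why a partial fetch can work for link previews.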
Cloudflare Radar URL Scanner — What It Is and How to Handle It
Cloudflare Radar URL Scanner is a security scanner that checks websites for vulnerabilities, misconfigurations, and security issues.
Security
WMF Citoid — What It Is and How to Handle It
Citoid is a Wikimedia service in VisualEditor that generates citations from URLs, DOIs, and ISBNs, relying on the Zotero Translation Server (see wikimedia-zotero) for accurate metadata, processed on demand from website visitors.
Other
cognitiveSEO Crawler — What It Is and How to Handle It
cognitiveSEO Crawler audits websites for technical SEO issues and backlink analysis as part of the cognitiveSEO platform.
SEO Tool
Cookie Hub — What It Is and How to Handle It
Cookie Hub is a security scanner that checks websites for vulnerabilities, misconfigurations, and security issues.
Security
Yahoo Link Preview — What It Is and How to Handle It
Yahoo Link Preview bot fetches metadata and preview data from URLs shared on Yahoo platforms and services. It supports Yahoo's social features, news aggregation, and content discovery across their ecosystem of web properties.
Preview
IsDownBot — What It Is and How to Handle It
IsDownBot monitors more than 4,400 cloud vendors and services to detect outages and downtime, aggregating status page information and providing real-time notifications to help teams respond quickly to service disruptions.
Monitoring
MontasticMonitor — What It Is and How to Handle It
MontasticMonitor is a monitoring crawler that checks website availability, performance, and health.
Monitoring
kb.dk_bot — What It Is and How to Handle It
kb.dk_bot is the web archiving crawler for Netarkivet, the Danish national web archive operated by the Royal Danish Library.
Research
Rakuten Image extraction bot — What It Is and How to Handle It
Rakuten Image extraction bot is Rakuten's crawler that extracts product images for their e-commerce marketplace.
Aggregator
Nooshub — What It Is and How to Handle It
Nooshub is a modern RSS reader bot that fetches RSS and Atom feeds and uses learning algorithms to group similar articles, remove clutter, and identify trending topics.
Feed Reader
Google Publisher Center — What It Is and How to Handle It
Google Publisher Center is a feed reader that fetches and processes RSS, Atom, and other content feeds.
Feed Reader
KargoBot-Artemis — What It Is and How to Handle It
KargoBot-Artemis is Kargo's autonomous content verification bot that simulates iOS device user behavior to scan websites for content quality and suitability, ensuring brand safety for advertisers on the Kargo ad network.
Monitoring
Attracta — What It Is and How to Handle It
Attracta is a web crawler that analyzes website content and structure to provide SEO optimization services, including link building strategies and search engine ranking improvements.
SEO Tool