How to Identify Google Crawlers by User-Agent
- 2016-04-23 15:12:00
- Original post by admin
The User-Agent header recorded in our access logs shows which Google spiders are crawling our website.
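"Crawler" is a generic term for any program (such as a robot or spider) that automatically discovers and scans websites by following links from one web page to another. Google's crawler is called Googlebot. The table below lists the common Google crawlers you may see in your site's logs, and how each should be specified in the robots.txt file, the robots meta tag, and the X-Robots-Tag HTTP directive.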
Google crawler | User-agent token | Full user-agent string (as seen in website logs) |
---|---|---|
Googlebot (Google Web search) | Googlebot | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) or (rarely used): Googlebot/2.1 (+http://www.google.com/bot.html) |
Googlebot News | Googlebot-News (Googlebot) | Googlebot-News |
Googlebot Images | Googlebot-Image (Googlebot) | Googlebot-Image/1.0 |
Googlebot Video | Googlebot-Video (Googlebot) | Googlebot-Video/1.0 |
Google Mobile (feature phone) | Googlebot-Mobile | |
Google Smartphone | Googlebot | Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) |
Google Mobile AdSense | Mediapartners-Google or Mediapartners (Googlebot) | [various mobile device types] (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html) |
Google AdSense | Mediapartners-Google or Mediapartners (Googlebot) | Mediapartners-Google |
Google AdsBot landing page quality check | AdsBot-Google | AdsBot-Google (+http://www.google.com/adsbot.html) |
Google app crawler (used to fetch resources for mobile apps; obeys AdsBot-Google robots rules) | AdsBot-Google-Mobile-Apps | AdsBot-Google-Mobile-Apps |
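
As a rough illustration of how the table can be applied to log data, here is a minimal Python sketch that maps a User-Agent string to one of the crawlers listed above. The function name `identify_google_crawler` and the `GOOGLE_CRAWLERS` list are hypothetical names made up for this example; the tokens themselves come from the table. Two assumptions are built in: a generic Googlebot string containing "Android" is treated as the smartphone crawler, and the two AdSense rows are not distinguished because both carry the same token. Keep in mind that the User-Agent header is set by the client and can be spoofed, so a match only shows what the client claims to be.

```python
# Tokens taken from the table above, paired with a crawler label.
# Order matters: more specific tokens must be checked before the
# generic "Googlebot" token, which is a substring of several others.
GOOGLE_CRAWLERS = [
    ("AdsBot-Google-Mobile-Apps", "Google app crawler"),
    ("AdsBot-Google", "Google AdsBot"),
    ("Mediapartners-Google", "Google AdSense"),
    ("Googlebot-News", "Googlebot News"),
    ("Googlebot-Image", "Googlebot Images"),
    ("Googlebot-Video", "Googlebot Video"),
    ("Googlebot-Mobile", "Google Mobile (feature phone)"),
    ("Googlebot", "Googlebot (Google Web search)"),
]


def identify_google_crawler(user_agent):
    """Return a crawler label for a User-Agent string, or None if no token matches."""
    ua = user_agent.lower()
    for token, label in GOOGLE_CRAWLERS:
        if token.lower() in ua:
            # Assumption: the smartphone crawler shares the "Googlebot" token
            # and is recognized here only by the "Android" marker in its string.
            if token == "Googlebot" and "android" in ua:
                return "Google Smartphone"
            return label
    return None


if __name__ == "__main__":
    print(identify_google_crawler("Googlebot-Image/1.0"))  # Googlebot Images
```

The check is a plain case-insensitive substring match, which is enough for a first pass over log files; anything that returns `None` is simply not claiming to be one of the crawlers in the table.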