# How is crawler data identified?

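The Analysys Ark (易观方舟) JS SDK uses the UA (User-Agent) information in the reported data to determine whether an event was generated by a crawler. During analysis, you can then use the crawler property on each event to filter out data that did not come from real users.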

## Regular expression used to identify crawlers

```
/(bot|crawler|spider|scrapy|dnspod|ia_archiver|jiankongbao|slurp|transcoder|networkbench|oneapm|PhantomJS|BingPreview)/i
```
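As a minimal sketch of how a pattern like the one above can be applied, the snippet below tests a User-Agent string against it. The helper name `isCrawlerUA` and the handling of missing values are assumptions for illustration only; they are not taken from the SDK source.

```javascript
// Pattern copied from this page. The helper below is hypothetical, not SDK code.
const CRAWLER_PATTERN = /(bot|crawler|spider|scrapy|dnspod|ia_archiver|jiankongbao|slurp|transcoder|networkbench|oneapm|PhantomJS|BingPreview)/i;

// Returns true when the User-Agent string contains any crawler token.
function isCrawlerUA(userAgent) {
  // Assumption: a missing or non-string UA is treated as "not a crawler".
  if (typeof userAgent !== 'string') return false;
  return CRAWLER_PATTERN.test(userAgent);
}

console.log(isCrawlerUA('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')); // true
console.log(isCrawlerUA('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36')); // false
```

In a browser, the value passed in would typically be `navigator.userAgent`. Because the pattern carries the `i` flag, matching is case-insensitive, so `DNSPod` and `dnspod` are treated the same.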

## Common crawler sources

1. **Baidu:** Baiduspider Mozilla/5.0 (compatible; Baiduspider/2.0;+<http://www.baidu.com/search/spider.html>)
2. **Baidu Images:** Baiduspider-image+(+<http://www.baidu.com/search/spider.htm>)
3. **Baidu PC:** Mozilla/5.0 (compatible; Baiduspider-render/2.0; +<http://www.baidu.com/search/spider.html>)
4. **Baidu Mobile:** Mozilla/5.0 (iPhone; CPU iPhone OS 9\_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; Baiduspider-render/2.0; +<http://www.baidu.com/search/spider.html>)
5. **Google:** Googlebot Mozilla/5.0 (compatible; Googlebot/2.1; +<http://www.google.com/bot.html>)
6. **Google Images:** AdsBot-Google-Mobile (+<http://www.google.com/mobile/adsbot.html>) Mozilla (iPhone; U; CPU iPhone OS 3 0 like Mac OS X) AppleWebKit (KHTML, like Gecko) Mobile Safari
7. **360 Spider:** 360Spider (360 Search) Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0);
8. **360 Website Security:** 360spider ([http://webscan.360.cn](http://webscan.360.cn/))
9. **Bing crawler:** bingbot Mozilla/5.0 (compatible; bingbot/2.0; +<http://www.bing.com/bingbot.htm>)
10. **Tencent Soso Spider:** Sosospider Sosospider+(+<http://help.soso.com/webspider.htm>)
11. **Soso Images:** Sosoimagespider+(+<http://help.soso.com/soso-image-spider.htm>)
12. **Yahoo Spider:** Yahoo! (Yahoo English) Mozilla/5.0 (compatible; Yahoo! Slurp; <http://help.yahoo.com/help/us/ysearch/slurp>)
13. **Yahoo China:** Mozilla/5.0 (compatible; Yahoo! Slurp China; <http://misc.yahoo.com.cn/help.html>)
14. **Youdao Spider:** YoudaoBot Mozilla/5.0 (compatible; YoudaoBot/1.0; <http://www.youdao.com/help/webmaster/spider/>; )
15. **Sogou Spider:** Sogou News Spider Sogou web spider/4.0(+<http://www.sogou.com/docs/help/webmasters.htm#07>)
16. **Sogou Images:** Sogou Pic Spider/3.0(+<http://www.sogou.com/docs/help/webmasters.htm#07>)
17. **Speedy Spider (Sweden):** Speedy Spider Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Speedy Spider (<http://www.entireweb.com/about/search_tech/speedy_spider/>)
18. **Yandex (Russia):** YandexBot Mozilla/5.0 (compatible; YandexBot/3.0; +<http://yandex.com/bots>)
19. **MSN Spider:** msnbot/msnbot-media msnbot/1.1 (+<http://search.msn.com/msnbot.htm>)
20. **Bing Spider:** bingbot/compatible Mozilla/5.0 (compatible; bingbot/2.0; +<http://www.bing.com/bingbot.htm>)
21. **Tingyun (听云) crawler:** networkbench Mozilla/5.0 (Windows NT 10.0; Trident/7.0; rv: 11.0;NetworkBench/8.0.1.309-5774440-2481662) like Gecko
22. **Alexa Spider:** ia\_archiver ia\_archiver/8.9 (Windows NT 3.1; en-US;)
23. **Easou Spider:** EasouSpider Mozilla/5.0 (compatible; EasouSpider; +<http://www.easou.com/search/spider.html>)
24. **Huawei Symantec Spider:** HuaweiSymantecSpider HuaweiSymantecSpider/1.0+DSE-support@huaweisymantec.com+(compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR ; <http://www.huaweisymantec.com/cn/IRL/spider>)
25. **Qiniu mirror spider:** qiniu qiniu-imgstg-spider-1.0
26. **DNSPod monitoring:** DNSPod DNSPod-Monitor/2.0
27. **LinkpadBot (Russia):** LinkpadBot Mozilla/5.0 (compatible; LinkpadBot/1.06; +[http://www.linkpad.ru](http://www.linkpad.ru/))
28. **MJ12bot (UK):** MJ12bot Mozilla/5.0 (compatible; MJ12bot/v1.4.0; <http://www.majestic12.co.uk/bot.php?+>)
29. **Jike Spider:** JikeSpider
30. **Etao Spider:** EtaoSpider Mozilla/5.0 (compatible; EtaoSpider/1.0; EtaoSpider)
31. **Artificial intelligence crawler:** crawler Mozilla/5.0 (compatible; 008/0.83; <http://www.80legs.com/webcrawler.html>) Gecko/2008032620
32. **Scrapy crawler:** scrapy
33. **Jiankongbao (监控宝):** jiankongbao Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; JianKongBao Monitor 1.1)
34. **OneAPM crawler:** OneAPM FFAgent Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0: OneAPM FFAgent)Gecko/20100101 Firefox/39.0
35. **PhantomJS:** PhantomJS Mozilla/5.0 (Unknown; Linux x86\_64)AppleWebKit/538.1 (KHTML,like Gecko)PhantomJS/2.1.1 Safari/538.1
36. **BingPreview:** Mozilla/5.0+(Windows+NT+6.1;+WOW64)+AppleWebKit/534++(KHTML,+like+Gecko)+BingPreview/1.0b
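To see which part of the pattern catches a given entry in this list, the following sketch returns the token that matched. The helper name `matchedCrawlerToken` is hypothetical and the sample strings are taken from the list above; this is an illustration of the regular expression's behavior, not the SDK's own code.

```javascript
// Same pattern as documented on this page; the capture group holds the matched token.
const CRAWLER_TOKEN_PATTERN = /(bot|crawler|spider|scrapy|dnspod|ia_archiver|jiankongbao|slurp|transcoder|networkbench|oneapm|PhantomJS|BingPreview)/i;

// Hypothetical helper: returns the first matched token in lowercase, or null if none matched.
function matchedCrawlerToken(userAgent) {
  const match = CRAWLER_TOKEN_PATTERN.exec(String(userAgent));
  return match ? match[1].toLowerCase() : null;
}

// Sample User-Agent strings from the list above.
const samples = [
  'Mozilla/5.0 (compatible; Baiduspider/2.0;+http://www.baidu.com/search/spider.html)',
  'Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)',
  'qiniu-imgstg-spider-1.0',
  'DNSPod-Monitor/2.0',
];

for (const ua of samples) {
  console.log(matchedCrawlerToken(ua), '<-', ua);
}
```

Most entries are caught by the generic `bot`, `spider`, or `crawler` tokens. The vendor-specific tokens such as `slurp`, `dnspod`, and `networkbench` cover agents whose names contain none of those words.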


