Running Scrapy Spider Files in Batch

Updated: 2020-09-30 10:31:15  Author: SteveForever

This article describes how to run multiple Scrapy spider files in one batch, with example code for each step. It is meant as a practical reference for study or day-to-day work.

There are two ways to run Scrapy spider files in batch:

1. Use CrawlerProcess

https://doc.scrapy.org/en/latest/topics/practices.html
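
The article only links to the documentation for this first method, so here is a minimal sketch of what it could look like. It assumes the script is run from inside a Scrapy project (so that get_project_settings() can find the settings) and reuses the spider_loader.list() call shown in method 2. The names schedule_all_spiders and main are hypothetical helpers, not part of Scrapy, and details may differ between Scrapy versions.

```python
def schedule_all_spiders(process):
    """Queue every spider that the project's spider loader can find.

    `process` is expected to behave like scrapy.crawler.CrawlerProcess:
    it exposes spider_loader.list() and crawl(name).
    """
    names = list(process.spider_loader.list())
    for name in names:
        process.crawl(name)  # schedules the spider; nothing runs yet
    return names


def main():
    # Scrapy is imported here so that schedule_all_spiders can be
    # exercised on its own without Scrapy installed.
    from scrapy.crawler import CrawlerProcess
    from scrapy.utils.project import get_project_settings

    process = CrawlerProcess(get_project_settings())
    scheduled = schedule_all_spiders(process)
    print("Scheduled spiders:", ", ".join(scheduled))
    process.start()  # blocks until every scheduled crawl has finished
```

Calling main() from a script in the project root should start all spiders in a single process.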

2. Modify the crawl command source and register it as a custom command

(1) Open scrapy/commands/crawl.py and you will see:

    def run(self, args, opts):
        if len(args) < 1:
            raise UsageError()
        elif len(args) > 1:
            raise UsageError("running 'scrapy crawl' with more than one spider is no longer supported")
        spname = args[0]

        self.crawler_process.crawl(spname, **opts.spargs)
        self.crawler_process.start()

This is the run() method in crawl.py. It decides which spider is run, so to run all spiders this is the method that has to change.

Inside run(), the call crawler_process.crawl(spname, **opts.spargs) schedules the spider, where spname is the spider name. To run several spiders we first need the names of all spiders in the project, which crawler_process.spider_loader.list() returns.

(2) Implementation steps:

a. Next to the spiders directory (at the same level), create a folder named mycmd to hold the command source, and create a file mycrawl.py inside it;

b. Copy the code from crawl.py into mycrawl.py, then modify it:

    # Modified run() method
    def run(self, args, opts):
        # Get the names of all spiders in the project
        spd_loader_list = self.crawler_process.spider_loader.list()
        # Schedule each spider; args is only used if no spiders were found
        for spname in spd_loader_list or args:
            self.crawler_process.crawl(spname, **opts.spargs)
            print("Starting spider: " + spname)
        self.crawler_process.start()
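
The loop above iterates over `spd_loader_list or args`, which is easy to misread. The sketch below isolates that expression in plain Python to show what it selects. Both function names are hypothetical, and the second function is an alternative ordering offered only for comparison, not something the article uses.

```python
def select_as_written(all_spiders, args):
    # Same expression as in the modified run(): the full spider list wins
    # whenever it is non-empty; args is only a fallback for an empty list.
    return all_spiders or args


def select_args_first(all_spiders, args):
    # Hypothetical variation: honour names given on the command line,
    # and fall back to every spider only when none were given.
    return args or all_spiders


all_spiders = ["news", "jobs"]
print(select_as_written(all_spiders, ["jobs"]))   # every spider is selected
print(select_args_first(all_spiders, ["jobs"]))   # only the requested spider
print(select_args_first(all_spiders, []))         # every spider is selected
```

With the expression as written, any spider names passed to the command are ignored as long as the project contains at least one spider, which matches the goal of running everything.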

You can also change the short description:

    def short_desc(self):
        return "Run all spiders"

c. Add an initialization file __init__.py to the mycmd folder, then add a setting of the form "COMMANDS_MODULE = '<project package>.<custom command directory>'" to the project settings file (settings.py);

For example: COMMANDS_MODULE = 'firstpjt.mycmd'

After that, running "scrapy -h" should list the new mycrawl command.

We can now start all spider files at once with:

scrapy mycrawl --nolog
