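Is this the right way to loop over page numbers in a Scrapy spider?

I'm writing a Scrapy spider that needs to crawl pages 1-100, so I generate the URLs in a loop. Here is the code: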

def start_requests(self):
    pages = []
    for i in range(1, 100):
        newpage = scrapy.Request("http://www.yyyy.com/yyy/yyy-list.php?page=%s" % i)
        pages.append(newpage)
    return pages

Is this right?


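import scrapy

url_prefix = "http://www.yyyy.com/yyy/yyy-list.php?page={}"


class YyyySpider(scrapy.spiders.Spider):
    name = "Yyyy"
    allowed_domains = ["yyyy.com"]
    # range(1, 101) covers pages 1..100; the upper bound is exclusive.
    start_urls = [url_prefix.format(i) for i in range(1, 101)]

    def parse(self, response):
        # Name each file after its page number. split("/")[-2] would be
        # "yyy" for every page, so each response would overwrite the last.
        page = response.url.split("page=")[-1]
        with open("page-{}.html".format(page), "wb") as f:
            f.write(response.body)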

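Something along these lines should work. Note that range(1, 100) stops at 99, so the upper bound has to be 101 to include page 100.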
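
To check the page range without running a crawl, the URL generation can be pulled out into a small helper. This is only a sketch: the names `URL_TEMPLATE` and `page_urls` are made up for illustration and do not come from Scrapy, and the placeholder URL is the one from the question.

```python
URL_TEMPLATE = "http://www.yyyy.com/yyy/yyy-list.php?page={}"


def page_urls(first=1, last=100):
    """Yield one list-page URL per page number, from first to last inclusive."""
    # range() excludes its upper bound, hence last + 1.
    for page in range(first, last + 1):
        yield URL_TEMPLATE.format(page)


urls = list(page_urls())
print(len(urls))  # how many page URLs were generated
print(urls[0])    # first page URL
print(urls[-1])   # last page URL
```

Inside start_requests, each of these URLs could then be wrapped with `yield scrapy.Request(url, callback=self.parse)` instead of building a list and returning it. Either form should work, but yielding avoids creating all the request objects up front.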
