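When writing a spider with scrapy, do I have to implement the simulated login myself, or does scrapy provide something for it?
Below is the code example from the documentation, which I don't fully understand: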
import scrapy


class LoginSpider(scrapy.Spider):
    name = 'example.com'
    start_urls = ['http://www.example.com/users/login.php']

    def parse(self, response):
        # Fill in the login form found on the page and submit it.
        return scrapy.FormRequest.from_response(
            response,
            formdata={'username': 'john', 'password': 'secret'},
            callback=self.after_login
        )

    def after_login(self, response):
        # check login succeed before going on
        # (response.text is a str; response.body is bytes in Python 3)
        if "authentication failed" in response.text:
            self.logger.error("Login failed")
            return
        # continue scraping with authenticated session...
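Have the spider POST to the login endpoint when it starts, then keep the cookies for the requests that follow. Scrapy's cookies middleware is enabled by default, so the session cookies are carried over automatically.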