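I can repeatedly reproduce a case where a request reports FETCH_ERROR, but when I open the task it is shown as ACTIVE. The result section has already been fetched, yet the task never changes to SUCCESS.
Here is one example.
Detail page: http://127.0.0.1:5000/task/zhihu:a27faabd644bc96a99af8650087c34bf
It shows the following: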
ACTIVE
zhihu.detail_page > http://www.zhihu.com/question/23890341 (1 hour ago updated )
taskid
a27faabd644bc96a99af8650087c34bf
lastcrawltime
1455451379.72 (1 hour ago)
updatetime
1455451379.72 (1 hour ago)
exetime
1455451379.72 (1 hour ago)
track.fetch ❌ 100.92ms
{
"content": "",
"encoding": "unicode",
"error": "HTTP 429: Unknown",
"headers": {
"Connection": "keep-alive",
"Content-Length": "0",
"Date": "Sun, 14 Feb 2016 12:02:41 GMT",
"Server": "ZWS",
"Set-Cookie": "aliyungf_tc=AQAAANY8a2GiVQ4Ax5ElZxA/ampqX420; Path=/; HttpOnly",
"Vary": "Accept-Encoding",
"X-Req-Id": "73E2399956C06CCD"
},
"ok": false,
"redirect_url": null,
"status_code": 429,
"time": 0.1009221076965332
}
track.process ❌ 0.15ms
HTTP 429: Unknown
[E 160214 20:02:36 base_handler:194] HTTP 429: Unknown
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/pyspider/libs/base_handler.py", line 187, in run_task
    result = self._run_task(task, response)
  File "/usr/local/lib/python2.7/dist-packages/pyspider/libs/base_handler.py", line 166, in _run_task
    response.raise_for_status()
  File "/usr/local/lib/python2.7/dist-packages/pyspider/libs/response.py", line 183, in raise_for_status
    raise http_error
HTTPError: HTTP 429: Unknown
{
"exception": "HTTP 429: Unknown",
"follows": 0,
"logs": "[E 160214 20:02:36 base_handler:194] HTTP 429: Unknown\n Traceback (most recent call last):\n File \"/usr/local/lib/python2.7/dist-packages/pyspider/libs/base_handler.py\", line 187, in run_task\n result = self._run_task(task, response)\n File \"/usr/local/lib/python2.7/dist-packages/pyspider/libs/base_handler.py\", line 166, in _run_task\n response.raise_for_status()\n File \"/usr/local/lib/python2.7/dist-packages/pyspider/libs/response.py\", line 183, in raise_for_status\n raise http_error\n HTTPError: HTTP 429: Unknown\n",
"ok": false,
"result": null,
"time": 0.0001518726348876953
}
schedule
{
"exetime": 1455451379.719501,
"itag": "v1",
"priority": 1,
"retried": 1
}
process
{
"callback": "detail_page"
}
fetch
{
"headers": {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Accept-Encoding": "gzip, deflate, sdch",
"Accept-Language": "en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4,zh-TW;q=0.2",
"Connection": "keep-alive",
"Host": "www.zhihu.com",
"Referer": "http://www.zhihu.com/",
"User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2357.130 Safari/537.36"
}
}
result
{
"answerCount": "9",
"question": "预算5000买HD650还是AKG702或者其他?",
"questionDetail": "(quora上耳机问题不多,还是知乎上经常有大神出没) MBP/iMac用户,因为工作特性,在办公室也经常有人找,不会长时间佩戴。我05年买的一个森海塞尔PX200用了9年,只换过一次皮套,对这个牌子有好感,so近来有了配一个中档开放式耳机的想法,首选也是森海塞尔HD650,看到知乎上很多人还推荐AKG702、拜亚DT880,B&W,有开始担心森海塞尔这几年水军太多。对耳机懂一些理论知识,没实战经验,深知自己门外汉,已经陷入中毒分析性瘫痪,时间有限、经济能力有限也没法一一去发烧,甚憾,因此力求发烧友和业内人士推荐解决方案! 总结一下: 1,想要一步到位,买个好耳机再听10年。 2,耳机+耳放,最高预算¥5000左右。 3,音源主要是Macbook Pro和iMac,不喜欢用手机听音乐。 4,在办公室听,就那种IT业常见的大车间,环境比较嘈杂,没有便携需求(所谓出街) 5,工作时背景音乐喜欢听古典音乐,休息专心听歌时偏好女声。",
"topAnswer": "HD650你可以聽10年。 MBP和iMac的話,你需要一台解碼器+耳放,因為MBP和iMac本身的音質不佳,倒不僅僅是驅動力夠不夠的問題。比較省事的就「節奏坦克幻想曲D/A」好了。 未來合適的時候可考慮把耳機的線升級,再加一台電子管耳放,可以得到更溫暖的聲音。電子管耳放最好是OTL設計(也即所謂 無輸出變壓器 的設計),對HD650來說,有更寬廣的動態範圍,和更快的瞬態響應速度。 祝你玩機愉快。",
"topAnswerAuthor": "堂主",
"topAnswerAuthorUrl": "http://www.zhihu.com/people/briansun",
"topAnswerCount": "23",
"url": "http://www.zhihu.com/question/23890341",
"watchCount": "70"
}
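The most recent fetch request failed, but the result shown is not necessarily from the most recent crawl. An earlier crawl succeeded, and the result section does not indicate which crawl produced it.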
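To make this concrete, here is a minimal toy model of the behaviour described above. It is not pyspider's actual code: `ToyTaskStore`, `record_run`, and `detail_page` are hypothetical names, and the model assumes that task status and saved results live in separate stores and that a failed run with retries left goes back to ACTIVE.

```python
class ToyTaskStore:
    """Hypothetical stand-in for a task store plus a separate result store."""

    def __init__(self):
        self.tasks = {}    # taskid -> {"status": ..., "track": ...}
        self.results = {}  # taskid -> result saved by the last successful run

    def record_run(self, taskid, ok, result=None, error=None):
        """Record the outcome of one crawl attempt."""
        task = self.tasks.setdefault(taskid, {"status": "ACTIVE", "track": None})
        # The track always describes the most recent attempt.
        task["track"] = {"ok": ok, "error": error}
        if ok:
            task["status"] = "SUCCESS"
            self.results[taskid] = result
        else:
            # Assumption: the task is queued for retry, and the previously
            # saved result is left untouched.
            task["status"] = "ACTIVE"

    def detail_page(self, taskid):
        """Combine both stores, roughly as a task detail view would."""
        task = self.tasks[taskid]
        return {
            "status": task["status"],
            "track": task["track"],
            "result": self.results.get(taskid),
        }


store = ToyTaskStore()
# An earlier crawl succeeded and saved a result.
store.record_run("a27faabd", ok=True, result={"answerCount": "9"})
# The latest crawl was rate limited.
store.record_run("a27faabd", ok=False, error="HTTP 429: Unknown")
view = store.detail_page("a27faabd")
print(view["status"], view["track"]["error"], view["result"])
```

In this model the detail view shows an ACTIVE task with a failed track next to a populated result, which matches the page in the report: the result is left over from the earlier successful crawl, not produced by the failed one.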