I'm learning Scrapy and trying to scrape the price of a graphics card on JD.com:
sel.xpath('//*[@id="summary-price"]/div[2]/strong/text()').extract()
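The result is empty. When I right-click in the browser and view the page source, the strong element contains no text. What is going on? Does the page have some kind of anti-scraping mechanism?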
<div id="summary-price">
<div class="dt">京 东 价:</div>
<div class="dd">
<strong class="p-price" id="jd-price"></strong>
<a data-type="1" data-sku="1430305" id="notice-downp" class="J-notify-1" href="#none" clstag="shangpin|keycount|product|jiangjia_1">(降价通知)</a>
</div>
</div>
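The price is not part of the initial HTML; the page fetches it with a separate Ajax request. Scrapy only downloads the raw HTML and does not execute JavaScript, which is why the strong element is empty.
The request URL looks something like this: http://p.3.cn/prices/get?skuid=J_1192826&type=1&area=1_72_2799&callback=cnp
The response looks something like this: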
cnp([{"id":"J_1192826","p":"899.00","m":"1199.00"}]);
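A response in this shape is JSONP: a JSON payload wrapped in a call to the callback named in the URL. As a rough sketch of how a scraper could read the price from it, the snippet below strips the wrapper and decodes the payload. The helper name parse_jsonp and the regular expression are illustrative assumptions, not part of JD's code or of Scrapy, and the field meanings ("p" as the current price, "m" as the list price) are inferred from the sample above.

```python
import json
import re

# Matches a JSONP body such as: cnp([...]);
# Group 1 captures everything between the outermost parentheses.
JSONP_RE = re.compile(r'^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$', re.DOTALL)


def parse_jsonp(text):
    """Strip the callback wrapper and decode the JSON payload (hypothetical helper)."""
    match = JSONP_RE.match(text)
    if match is None:
        raise ValueError("not a JSONP response")
    return json.loads(match.group(1))


# Sample body copied from the response shown above.
sample = 'cnp([{"id":"J_1192826","p":"899.00","m":"1199.00"}]);'
items = parse_jsonp(sample)
price = items[0]["p"]         # assumed to be the current price
list_price = items[0]["m"]    # assumed to be the list price
```

The prices arrive as strings, so convert them (for example with decimal.Decimal) before doing arithmetic on them.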
The cnp function is then defined like this:
function cnp(a) {
    var b = "";
    if (dCashDescInfo.loadPriceCnt++, a && a.length > 0) {
        var c = a[0].p;
        var d = a[0].m;
        var e = new Number(c);
        var f = new Number(d);
        if (pageConfig.product.jp = e, pageConfig.product.mp = f, e > 0) {
            b = "\uffe5" + c; {
                pageConfig.product.cat[0]
            }
            $("#summary-price .p-discount").html(G.discount(e, f)),
            dCashDescInfo.bigger39 = 39 > e ? !1 : !0, itemEasyBuy.bigger10 = 10 > e ? !1 : !0, newEasyBuyInit()
        }
    }
    b ? ($("#summary-price").find(".p-discount, .pricing").show(), $("#page_maprice").html("\uffe5" + d)) : $("#summary-price").find(".p-discount, .pricing").hide(), b || (b = "\u6682\u65e0\u62a5\u4ef7"), $("#summary-price .p-price, #mini-jd-price").html(b), pageConfig.eventTarget.fire({
        type: "onPriceReady",
        price: a[0]
    })
}
In the end, the function uses $('#summary-price .p-price') to find the DOM node that displays the price and fills the price into it.
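Since the price comes from its own request, a scraper can call that endpoint directly instead of reading the strong element. Below is a minimal sketch of building the URL for a given product. The function name build_price_url is hypothetical, the "J_" prefix and the area value are simply copied from the example URL above, and the endpoint and its parameters may have changed since this was written.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL in the answer; it may no longer be valid.
PRICE_ENDPOINT = "http://p.3.cn/prices/get"


def build_price_url(sku, area="1_72_2799", callback="cnp"):
    """Build the price URL for one SKU (hypothetical helper).

    Assumes the skuid parameter is the SKU number prefixed with "J_",
    as in the example URL, and reuses that URL's area code as a default.
    """
    query = urlencode({
        "skuid": "J_" + str(sku),
        "type": 1,
        "area": area,
        "callback": callback,
    })
    return PRICE_ENDPOINT + "?" + query


url = build_price_url(1192826)
```

In a Scrapy spider, one option is to read the SKU from the product page (the data-sku attribute in the HTML above looks like a candidate), yield a second scrapy.Request for this URL, and decode the JSONP body in that request's callback.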