import requests
from bs4 import BeautifulSoup
# The URL is split into adjacent string literals so it stays one valid string
res = requests.get('https://s.taobao.com/search?'
                   'initiative_id=tbindexz_20160515&ie=utf8&spm=a21bo.50862.201856-taobao-item.2&sourceId=tb.index&search_type=item&ssid=s5-e&commend=all&imgfile=&q=python%E4%B9%A6%E7%B1%8D&suggest=0_5&_input_charset=utf-8&wq=python&suggest_query=python&source=suggest')
soup = BeautifulSoup(res.text, 'lxml')
for item in soup.select('.item'):
    print(item.select('strong'))
Why does this crawler always end with "Process finished with exit code 0" when I run it in PyCharm?
I get no output, yet the program doesn't seem to have any errors.
import requests
import re
import json
res = requests.get('https://s.taobao.com/search?initiative_id=tbindexz_20160515&ie=utf8&spm=a21bo.50862.201856-taobao'
'-item.2&sourceId=tb.index&search_type=item&ssid=s5-e&commend=all&imgfile=&q=python%E4%B9%A6%E7%B1'
'%8D&suggest=0_5&_input_charset=utf-8&wq=python&suggest_query=python&source=suggest')
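# The item data is embedded in the page source as a JavaScript assignment: g_page_config = {...};
rs = re.search(r'g_page_config = (.*?);\n', res.text)
if rs is None:
    raise SystemExit('g_page_config not found in the response')
g_page_config = json.loads(rs.group(1))
items = g_page_config['mods']['itemlist']['data']['auctions']
for item in items:
    print('-' * 100)
    print(item['raw_title'])
    print(item['view_price'])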
This approach does get the data out, though it feels a bit odd.
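
An exit code of 0 with no output means the loop body in the question never ran, i.e. soup.select('.item') returned an empty list. Presumably the items are not present as HTML elements in the response and are instead rendered in the browser from the g_page_config JSON, which is why the regex approach finds them. Below is a minimal offline sketch of the same extraction. SAMPLE_HTML and the helper name extract_auctions are made up for illustration, and the key path simply mirrors the one used in the answer; the real page may differ or may have changed.

```python
import json
import re

# Hypothetical sample: a page whose item data lives in an inline script
# rather than in HTML elements (assumed shape, based on the answer above).
SAMPLE_HTML = """<html><body>
<div id="mainsrp-itemlist"></div>
<script>
    g_page_config = {"mods": {"itemlist": {"data": {"auctions": [{"raw_title": "Python book", "view_price": "59.00"}]}}}};
    g_srp_loadCss();
</script>
</body></html>"""


def extract_auctions(html):
    """Return the list of item dicts embedded in g_page_config, or [] if absent."""
    # Capture everything between 'g_page_config = ' and the ';' that ends the line.
    match = re.search(r'g_page_config = (.*?);\n', html)
    if match is None:
        return []
    config = json.loads(match.group(1))
    # Walk the nested keys defensively so a missing level yields an empty list.
    return (config.get('mods', {})
                  .get('itemlist', {})
                  .get('data', {})
                  .get('auctions', []))


for item in extract_auctions(SAMPLE_HTML):
    print(item['raw_title'], item['view_price'])
```

Returning an empty list when the pattern is missing keeps the caller's loop safe, and checking for that empty list is a quick way to tell whether the server sent the expected page at all.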