The same code runs fine in Atom but raises an error in PyCharm.
Python version: Python 3.5
The code is as follows:
from bs4 import BeautifulSoup
html_file = '/Users/yu7eng/Desktop/1_2_homework_required/index.html'
with open(html_file, 'r') as web_data:
    soup = BeautifulSoup(web_data, 'lxml')
    titles = soup.select('body > div > div > div.col-md-9 > div > div > div > div.caption > h4 > a')
    print(titles)
PyCharm environment: (screenshot)
PyCharm output: (screenshot)
Atom output: (screenshot)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x...
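This is an encoding error. You can see that the encoding shown in the bottom-right corner of Atom is UTF-8. PyCharm's encoding isn't visible in your screenshot; it is also shown in the bottom-right corner, so check whether it is UTF-8.
If it isn't, open File -> Settings -> Editor -> File Encodings and change it there.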
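One last suggestion: put #coding=utf-8 on the first line of every Python file. It saves a lot of trouble, especially when Chinese text is involved, although it only declares the encoding of the source file itself and does not change how open() decodes data files.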
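The editor's encoding setting and the encoding Python uses to decode a file are separate things. When open() is called in text mode without an encoding argument, Python 3.5 falls back to the locale's preferred encoding, which can differ depending on how the interpreter was launched. A minimal sketch for comparing the two environments, assuming the same script is run once from each (the helper name default_text_encoding is made up for illustration):

```python
import locale
import sys


def default_text_encoding():
    """Return the encoding open() falls back to in text mode
    when no encoding argument is given (hypothetical helper)."""
    return locale.getpreferredencoding(False)


if __name__ == '__main__':
    # Run this in both Atom and PyCharm and compare the output.
    print('open() default encoding:', default_text_encoding())
    print('stdout encoding:', sys.stdout.encoding)
```

If the two runs report different values, that would be consistent with the same open() call succeeding in one environment and failing in the other.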
Your settings probably didn't get imported.
Problem solved: it works after adding encoding='utf-8', as shown below:
from bs4 import BeautifulSoup
html_file = '/Users/yu7eng/Desktop/1_2_homework_required/index.html'
with open(html_file, 'r', encoding='utf-8') as web_data:
    soup = BeautifulSoup(web_data, 'lxml')
    titles = soup.select('body > div > div > div.col-md-9 > div > div > div > div.caption > h4 > a')
    print(titles)
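The effect of the encoding argument can be reproduced without BeautifulSoup. The following is only a sketch: it writes a small UTF-8 file to a temporary location (the sample content and the names read_html and demo are made up for illustration), then reads it back once as ASCII and once as UTF-8.

```python
import os
import tempfile


def read_html(path, encoding):
    """Read a text file using an explicitly chosen encoding."""
    with open(path, 'r', encoding=encoding) as f:
        return f.read()


def demo():
    # Hypothetical sample content containing non-ASCII characters.
    sample = '<h4><a>标题</a></h4>'
    fd, path = tempfile.mkstemp(suffix='.html')
    try:
        # Write the sample as raw UTF-8 bytes.
        with os.fdopen(fd, 'wb') as f:
            f.write(sample.encode('utf-8'))
        # Decoding the same bytes as ASCII fails on the first non-ASCII byte.
        try:
            read_html(path, 'ascii')
            ascii_failed = False
        except UnicodeDecodeError:
            ascii_failed = True
        # Decoding as UTF-8 returns the original text.
        utf8_text = read_html(path, 'utf-8')
    finally:
        os.remove(path)
    return ascii_failed, utf8_text


if __name__ == '__main__':
    print(demo())
```

Passing encoding explicitly makes the result independent of the environment the script is launched from, which is why the fixed version behaves the same in Atom and PyCharm.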