Same image URL gives a different file when downloaded in a loop than when downloaded on its own

Python 3.4. I have tried both urllib and requests.

import requests
import urllib.request as req
# fh=open('1.jpg','wb')
# r=requests.get('http://img4.doubanio.com/view/photo/large/public/p2274339949.jpg')
# fh.write(r.content)

r0=req.urlopen('http://fmn.xnpic.com/fmn072/20151116/2150/large_uR0J_01c00001ae451e80.jpg')
r0ct=r0.read()
with open('0.jpg','wb') as fh:  # the with block closes the file once the write is done
    fh.write(r0ct)

Either way I get an image of a bit over 90 KB.
But if I take

http://fmn.xnpic.com/fmn072/20151116/2150/large_uR0J_01c00001ae451e80.jpg
http://fmn.rrimg.com/fmn074/20151116/2150/large_E3iX_42840007b9d91e83.jpg
http://fmn.xnpic.com/fmn071/20151113/2115/large_TAjO_000900002a531e83.jpg
http://fmn.rrimg.com/fmn074/20151113/2110/large_lzFN_01d2000157f91e80.jpg
http://fmn.rrimg.com/fmn076/20151113/2110/large_ey8q_429c000763d01e83.jpg
http://fmn.rrimg.com/fmn074/20151112/1125/large_kvEe_019600012cec1e80.jpg
http://fmn.rrimg.com/fmn076/20151112/1125/large_8T3e_f2e0000307fc1e83.jpg
http://fmn.rrimg.com/fmn075/20151112/1125/large_FwMZ_5a8a000738481e83.jpg
http://fmn.rrimg.com/fmn076/20151111/1250/large_Cr2Y_83700002e57d1e7f.jpg
http://fmn.rrfmn.com/fmn079/20151104/2235/large_uieJ_427e000665001e83.jpg
http://fmn.rrfmn.com/fmn070/20151104/2235/large_9szi_01c0000059fa1e80.jpg
http://fmn.xnpic.com/fmn071/20151102/2145/large_uRTW_01a200001f6e1e80.jpg
http://fmn.rrfmn.com/fmn079/20151111/1245/large_NOnh_5de40002ee321e84.jpg
http://fmn.rrfmn.com/fmn079/20151111/1245/large_wC5M_27e6000702211e7f.jpg
http://fmn.xnpic.com/fmn072/20151111/1245/large_Gx9M_01750001130c1e80.jpg
http://fmn.rrimg.com/fmn075/20151105/1555/large_J10e_f2e00002477f1e83.jpg
http://fmn.rrfmn.com/fmn079/20151102/2140/large_sDSs_466c0005d3d51e7f.jpg
http://fmn.rrimg.com/fmn073/20151102/2140/large_eTzT_84c4000629df1e84.jpg
http://fmn.rrimg.com/fmn074/20151030/1205/large_G9t5_74780005c4921e80.jpg
http://fmn.rrfmn.com/fmn078/20151030/1205/large_uply_8485000190881e7f.jpg
http://fmn.rrfmn.com/fmn070/20151021/2250/large_HJNg_5bcf0000a0831e84.jpg
http://fmn.rrimg.com/fmn073/20151030/1205/large_dc4v_5e4800056f2c1e84.jpg
http://fmn.xnpic.com/fmn072/20151102/2140/large_yfmZ_84f400062a961e84.jpg
http://fmn.rrimg.com/fmn075/20151030/1205/large_ixFg_42540005c5e31e83.jpg
http://fmn.xnpic.com/fmn072/20151029/2310/large_cNey_5bc9000188141e84.jpg
http://fmn.xnpic.com/fmn072/20151029/2310/large_w6EL_f0a70001884b1e83.jpg
http://fmn.xnpic.com/fmn072/20151029/2310/large_ehIE_74c60005b8311e80.jpg
http://fmn.xnpic.com/fmn072/20151029/2310/large_mQgB_278000059ff01e7f.jpg
http://fmn.rrimg.com/fmn075/20151029/2310/large_MRFI_5c61000187ce1e84.jpg
http://fmn.rrfmn.com/fmn070/20151029/2310/large_zXjK_422d000594d51e83.jpg
http://fmn.rrfmn.com/fmn079/20151029/2310/large_Ok1m_51e100057f451e80.jpg

and so on, and download them in a loop, I only get files of ten-odd KB each.

for i, url in enumerate(rs1):  # rs1 is the URL list; how it was built is not shown here
    r=requests.get(url)
    with open(str(i)+'.jpg','wb') as fh:
        fh.write(r.content)

Why is that?
Is it something on Renren's side?

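PS: why are these downloads so slow? Each image takes about 10 s, while in Chrome with the cache disabled they open almost instantly on my connection.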


Update:
Thanks to the first reply for the hint. I went back and checked, and I really was getting thumbnails.

import re

pattern1=r'p/\w+?_' # pattern for the part of the URL to strip out
p1cp=re.compile(pattern1)
for url in rs1:
    url=p1cp.sub('',url)
    print(url)
print(rs1)

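These two prints give different output. So now I have a new question.
Ah, there is already a similar question: http://.com/q/1010000000151942
But I would still like to know how the for loop in Python is actually implemented.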
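As a rough sketch of what a for loop does, it can be written out by hand with `iter()` and `next()`. The list `items` and the `upper()` call are made up for illustration; this mirrors the documented iterator protocol rather than the interpreter's actual internals.

```python
items = ['a', 'b', 'c']

# Roughly equivalent to:
#     for x in items:
#         x = x.upper()
it = iter(items)              # ask the list for an iterator
while True:
    try:
        x = next(it)          # bind the name x to the next element
    except StopIteration:     # the iterator is exhausted, so the loop ends
        break
    x = x.upper()             # rebinds the name x only; items is not touched

print(items)                  # the list still holds the original strings
```

The loop variable is just a name bound to each element in turn, so assigning to it never writes back into the list.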


OP here.
Well, this is embarrassing.
It happened because, in

for url in rs1: 
    url=p1cp.sub('',url)
    print(url)
print(rs1)

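I only reassigned the loop variable, so the URLs in rs1 were never changed and were still the thumbnail URLs.
At least I now understand Python's for loop a bit better.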
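A minimal sketch of the problem and one possible fix. The URLs below are hypothetical stand-ins (the real thumbnail URLs were not posted); the pattern is the one from the question.

```python
import re

# Hypothetical thumbnail-style URLs, for illustration only.
rs1 = [
    'http://example.com/fmn072/p/m3w210_large_abc.jpg',
    'http://example.com/fmn074/p/m3w210_large_def.jpg',
]
p1cp = re.compile(r'p/\w+?_')  # same pattern as in the question

# Reassigning the loop variable leaves the list untouched.
for url in rs1:
    url = p1cp.sub('', url)
unchanged = list(rs1)

# Building a new list keeps the substituted strings.
cleaned = [p1cp.sub('', url) for url in rs1]

print(unchanged[0])
print(cleaned[0])
```

Downloading from `cleaned` instead of `rs1` would then request the full-size URLs.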
Also, note to self: next time, post the complete code when asking a question, you dummy!

Thanks, everyone, for the careful answers~
Getting so many thoughtful answers to my very first question makes me really happy.


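I wonder whether something went wrong in the stream handling. Perhaps one stream had not been fully processed before the next one was started.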
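One way to rule this out, as a sketch only: write each file inside a `with` block so it is flushed and closed before the next download starts, then compare the size on disk with the number of bytes received. The helper `save_bytes` and the stand-in payload are hypothetical. Note that with requests, `r.content` already reads the whole response body by default.

```python
import os
import tempfile


def save_bytes(path, data):
    # The with block flushes and closes the file before the function returns.
    with open(path, 'wb') as fh:
        fh.write(data)
    return os.path.getsize(path)


payload = b'\xff\xd8' + b'\x00' * 1000   # stand-in bytes, not a real JPEG
target = os.path.join(tempfile.mkdtemp(), '0.jpg')
size = save_bytes(target, payload)
print(size == len(payload))
```

If the sizes match for every file, truncated writes are not the cause and the URLs themselves are the next thing to check.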


Could it be that you downloaded the thumbnails?


"I only get files of ten-odd KB"

What does this sentence mean? That the file size differs from the real size?

I just tried this with requests and the downloaded files are fine. Here is the code:

import requests

images = [
    'http://fmn.xnpic.com/fmn072/20151116/2150/large_uR0J_01c00001ae451e80.jpg',
    'http://fmn.rrimg.com/fmn074/20151116/2150/large_E3iX_42840007b9d91e83.jpg',
    'http://fmn.xnpic.com/fmn071/20151113/2115/large_TAjO_000900002a531e83.jpg',
    'http://fmn.rrimg.com/fmn074/20151113/2110/large_lzFN_01d2000157f91e80.jpg',
    'http://fmn.rrimg.com/fmn076/20151113/2110/large_ey8q_429c000763d01e83.jpg',
    'http://fmn.rrimg.com/fmn074/20151112/1125/large_kvEe_019600012cec1e80.jpg',
    'http://fmn.rrimg.com/fmn076/20151112/1125/large_8T3e_f2e0000307fc1e83.jpg',
    'http://fmn.rrimg.com/fmn075/20151112/1125/large_FwMZ_5a8a000738481e83.jpg',
    'http://fmn.rrimg.com/fmn076/20151111/1250/large_Cr2Y_83700002e57d1e7f.jpg',
    'http://fmn.rrfmn.com/fmn079/20151104/2235/large_uieJ_427e000665001e83.jpg',
    'http://fmn.rrfmn.com/fmn070/20151104/2235/large_9szi_01c0000059fa1e80.jpg',
    'http://fmn.xnpic.com/fmn071/20151102/2145/large_uRTW_01a200001f6e1e80.jpg',
    'http://fmn.rrfmn.com/fmn079/20151111/1245/large_NOnh_5de40002ee321e84.jpg',
    'http://fmn.rrfmn.com/fmn079/20151111/1245/large_wC5M_27e6000702211e7f.jpg',
    'http://fmn.xnpic.com/fmn072/20151111/1245/large_Gx9M_01750001130c1e80.jpg',
    'http://fmn.rrimg.com/fmn075/20151105/1555/large_J10e_f2e00002477f1e83.jpg',
    'http://fmn.rrfmn.com/fmn079/20151102/2140/large_sDSs_466c0005d3d51e7f.jpg',
    'http://fmn.rrimg.com/fmn073/20151102/2140/large_eTzT_84c4000629df1e84.jpg',
    'http://fmn.rrimg.com/fmn074/20151030/1205/large_G9t5_74780005c4921e80.jpg',
    'http://fmn.rrfmn.com/fmn078/20151030/1205/large_uply_8485000190881e7f.jpg',
    'http://fmn.rrfmn.com/fmn070/20151021/2250/large_HJNg_5bcf0000a0831e84.jpg',
    'http://fmn.rrimg.com/fmn073/20151030/1205/large_dc4v_5e4800056f2c1e84.jpg',
    'http://fmn.xnpic.com/fmn072/20151102/2140/large_yfmZ_84f400062a961e84.jpg',
    'http://fmn.rrimg.com/fmn075/20151030/1205/large_ixFg_42540005c5e31e83.jpg',
    'http://fmn.xnpic.com/fmn072/20151029/2310/large_cNey_5bc9000188141e84.jpg',
    'http://fmn.xnpic.com/fmn072/20151029/2310/large_w6EL_f0a70001884b1e83.jpg',
    'http://fmn.xnpic.com/fmn072/20151029/2310/large_ehIE_74c60005b8311e80.jpg',
    'http://fmn.xnpic.com/fmn072/20151029/2310/large_mQgB_278000059ff01e7f.jpg',
    'http://fmn.rrimg.com/fmn075/20151029/2310/large_MRFI_5c61000187ce1e84.jpg',
    'http://fmn.rrfmn.com/fmn070/20151029/2310/large_zXjK_422d000594d51e83.jpg',
    'http://fmn.rrfmn.com/fmn079/20151029/2310/large_Ok1m_51e100057f451e80.jpg'
]

for i in images:
    r = requests.get(i)
    # Save under the original file name; the with block closes the file.
    with open(i.split('/')[-1], 'wb') as fh:
        fh.write(r.content)

Also, requests is blocking, so it is better to do this with asyncio. See the example:

import aiohttp
import asyncio

images = [
    'http://fmn.xnpic.com/fmn072/20151116/2150/large_uR0J_01c00001ae451e80.jpg',
    'http://fmn.rrimg.com/fmn074/20151116/2150/large_E3iX_42840007b9d91e83.jpg',
    'http://fmn.xnpic.com/fmn071/20151113/2115/large_TAjO_000900002a531e83.jpg',
    'http://fmn.rrimg.com/fmn074/20151113/2110/large_lzFN_01d2000157f91e80.jpg',
    'http://fmn.rrimg.com/fmn076/20151113/2110/large_ey8q_429c000763d01e83.jpg',
    'http://fmn.rrimg.com/fmn074/20151112/1125/large_kvEe_019600012cec1e80.jpg',
    'http://fmn.rrimg.com/fmn076/20151112/1125/large_8T3e_f2e0000307fc1e83.jpg',
    'http://fmn.rrimg.com/fmn075/20151112/1125/large_FwMZ_5a8a000738481e83.jpg',
    'http://fmn.rrimg.com/fmn076/20151111/1250/large_Cr2Y_83700002e57d1e7f.jpg',
    'http://fmn.rrfmn.com/fmn079/20151104/2235/large_uieJ_427e000665001e83.jpg',
    'http://fmn.rrfmn.com/fmn070/20151104/2235/large_9szi_01c0000059fa1e80.jpg',
    'http://fmn.xnpic.com/fmn071/20151102/2145/large_uRTW_01a200001f6e1e80.jpg',
    'http://fmn.rrfmn.com/fmn079/20151111/1245/large_NOnh_5de40002ee321e84.jpg',
    'http://fmn.rrfmn.com/fmn079/20151111/1245/large_wC5M_27e6000702211e7f.jpg',
    'http://fmn.xnpic.com/fmn072/20151111/1245/large_Gx9M_01750001130c1e80.jpg',
    'http://fmn.rrimg.com/fmn075/20151105/1555/large_J10e_f2e00002477f1e83.jpg',
    'http://fmn.rrfmn.com/fmn079/20151102/2140/large_sDSs_466c0005d3d51e7f.jpg',
    'http://fmn.rrimg.com/fmn073/20151102/2140/large_eTzT_84c4000629df1e84.jpg',
    'http://fmn.rrimg.com/fmn074/20151030/1205/large_G9t5_74780005c4921e80.jpg',
    'http://fmn.rrfmn.com/fmn078/20151030/1205/large_uply_8485000190881e7f.jpg',
    'http://fmn.rrfmn.com/fmn070/20151021/2250/large_HJNg_5bcf0000a0831e84.jpg',
    'http://fmn.rrimg.com/fmn073/20151030/1205/large_dc4v_5e4800056f2c1e84.jpg',
    'http://fmn.xnpic.com/fmn072/20151102/2140/large_yfmZ_84f400062a961e84.jpg',
    'http://fmn.rrimg.com/fmn075/20151030/1205/large_ixFg_42540005c5e31e83.jpg',
    'http://fmn.xnpic.com/fmn072/20151029/2310/large_cNey_5bc9000188141e84.jpg',
    'http://fmn.xnpic.com/fmn072/20151029/2310/large_w6EL_f0a70001884b1e83.jpg',
    'http://fmn.xnpic.com/fmn072/20151029/2310/large_ehIE_74c60005b8311e80.jpg',
    'http://fmn.xnpic.com/fmn072/20151029/2310/large_mQgB_278000059ff01e7f.jpg',
    'http://fmn.rrimg.com/fmn075/20151029/2310/large_MRFI_5c61000187ce1e84.jpg',
    'http://fmn.rrfmn.com/fmn070/20151029/2310/large_zXjK_422d000594d51e83.jpg',
    'http://fmn.rrfmn.com/fmn079/20151029/2310/large_Ok1m_51e100057f451e80.jpg'
]

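async def write_to_file(session, url):
    # Fetch one image and save it under its original file name.
    async with session.get(url) as r:
        data = await r.read()
    with open(url.split('/')[-1], 'wb') as f:
        f.write(data)


async def main():
    # aiohttp.get() was removed in aiohttp 3.0, so share one ClientSession.
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(write_to_file(session, url) for url in images))


if __name__ == '__main__':
    loop = asyncio.new_event_loop()
    loop.run_until_complete(main())
    loop.close()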

This is much faster: the file writes are still blocking, but the network requests are now asynchronous. (Python 3.5)
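To see where the speed-up comes from without touching the network, here is a small simulation. `fake_download` is a hypothetical stand-in that just sleeps, the example.com URLs are made up, and `asyncio.run` assumes Python 3.7 or later.

```python
import asyncio
import time


async def fake_download(url, delay=0.2):
    # Stand-in for a network request: asyncio.sleep yields to the event loop.
    await asyncio.sleep(delay)
    return url.split('/')[-1]


async def download_all(urls):
    # gather() runs the coroutines concurrently and returns results in input order.
    return await asyncio.gather(*(fake_download(u) for u in urls))


def timed_run(urls):
    start = time.perf_counter()
    names = asyncio.run(download_all(urls))
    return names, time.perf_counter() - start


urls = ['http://example.com/img/%d.jpg' % i for i in range(5)]
names, elapsed = timed_run(urls)
print(names, round(elapsed, 2))
```

Five simulated downloads of 0.2 s each would take about 1 s one after another; run concurrently, the waits overlap and the total stays close to a single delay.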
