A comparison of pycurl and requests for downloading a large file: the code for each, plus a rough performance comparison.
The pycurl version:
import pycurl

def download_pycurl(url, filename):
    # Stream the response body straight into the open file via WRITEDATA.
    with open(filename, "wb") as fp:
        curl = pycurl.Curl()
        curl.setopt(pycurl.URL, url)
        curl.setopt(pycurl.USERAGENT, "Mozilla/5.0")
        curl.setopt(pycurl.WRITEDATA, fp)  # libcurl writes chunks directly to fp
        curl.perform()
        curl.close()
    return filename
Note in particular: the URL passed in here must be URL-encoded by the caller; pycurl will not encode it for you.
For example, a URL containing a space could be pre-encoded with urllib.parse.quote before being handed to pycurl. A minimal sketch (the example.com URL is hypothetical, and only the path is quoted here; a real URL might also need its query string encoded):

from urllib.parse import urlsplit, urlunsplit, quote

def encode_url(raw_url):
    # Percent-encode the path; leave scheme and host untouched.
    parts = urlsplit(raw_url)
    return urlunsplit(parts._replace(path=quote(parts.path)))

print(encode_url("https://example.com/my file.txt"))
# -> https://example.com/my%20file.txt

The requests version:
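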
import requests

def download_requests(url, filename):
    # stream=True keeps the body out of memory; iterate it in 32 KB chunks.
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open(filename, "wb") as f:
            for chunk in r.iter_content(chunk_size=1024 * 32):
                f.write(chunk)
    return filename
The URL passed here is encoded by requests automatically; for example, a URL containing spaces can be passed as-is.
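One quick way to see this (an illustrative sketch; the example.com URL is hypothetical): preparing a request shows the space already percent-encoded before anything is sent.

import requests

prepared = requests.Request("GET", "https://example.com/my file.txt").prepare()
print(prepared.url)  # -> https://example.com/my%20file.txt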
As for performance, the difference is actually quite small. Downloading a 370 MB file from S3, the measured times were:
pycurl: 48.79s
requests: 50.23s
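For context, a minimal sketch of how the two functions above could be timed side by side (the S3 URL and output paths are placeholders, not the original 370 MB test file):

import time

URL = "https://example-bucket.s3.amazonaws.com/big-file.bin"  # placeholder URL

for name, func in [("pycurl", download_pycurl), ("requests", download_requests)]:
    start = time.perf_counter()
    func(URL, f"/tmp/out-{name}.bin")
    print(f"{name}: {time.perf_counter() - start:.2f}s")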