Adding the gzip header has no effect when scraping Baidu with the python-requests library


Here are the details.

The headers I captured are as follows (these are the default request headers that requests sends):


    {'Accept-Encoding': 'gzip, deflate', 'User-Agent': 'python-requests/2.6.1 CPython/3.4.3 Windows/8', 'Connection': 'keep-alive', 'Accept': '*/*'}

So I added the same header to my request:


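    import requests
    from bs4 import BeautifulSoup

    # domain holds the Baidu URL being scraped (defined elsewhere)
    source = requests.get(domain, headers={'Accept-Encoding': 'gzip, deflate'}).text
    html = BeautifulSoup(source, 'lxml')
    picture_url_list = html.find_all('div')
    print(picture_url_list)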

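The output is garbled text.

However, if I send only 'Accept-Encoding': 'deflate' or 'Accept-Encoding': '', the result is normal again.

The response's Content-Encoding really is gzip, so why doesn't the request work?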
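One way to narrow this down is to look at the raw bytes instead of `.text`. The sketch below is only a diagnostic aid, not a confirmed fix: `decode_body` is a hypothetical helper name, and the sample bytes stand in for a real response so the snippet runs without a network. It checks whether a body still starts with the gzip magic bytes, decompresses it if so, and then decodes it with an explicit charset.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decode_body(raw: bytes, charset: str = "utf-8") -> str:
    """Hypothetical helper: gunzip the body if it is still compressed, then decode it."""
    if raw[:2] == GZIP_MAGIC:
        # The body was not decompressed upstream, so do it here.
        raw = gzip.decompress(raw)
    # errors="replace" keeps the call from raising on a wrong charset guess.
    return raw.decode(charset, errors="replace")


# Stand-in for a response body; not real Baidu output.
sample = "<div>百度一下</div>"
compressed = gzip.compress(sample.encode("utf-8"))

print(decode_body(compressed))               # compressed input
print(decode_body(sample.encode("utf-8")))   # already-plain input
```

With requests, the same helper could be fed `resp.content` together with `resp.apparent_encoding`, and `resp.headers.get('Content-Encoding')` shows what the server claims to have sent. If the bytes are already plain, the garbling is more likely a charset issue than a gzip one, though that is an assumption to verify against the actual response.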

gzip python3.x python-crawler

10 years, 6 months ago

miaofun

