Building request headers with urllib in Python 3
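1. Open IDLE, the Python development tool, create a new file named 'head.py', and enter the following code:

    import urllib.request

    # Headers to send with the request; the User-Agent is copied from a real browser
    header = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.90 Safari/537.36 2345Explorer/9.5.2.18321'
    }
    url = 'http://www.baidu.com'
    request = urllib.request.Request(url=url, headers=header)
    resp = urllib.request.urlopen(request)
    print(resp.getcode())  # HTTP status code of the response

This User-Agent was copied from a browser: press F12 to open the developer tools, switch to the Network tab, load a website, and click the URL listed under the Name column; the header information then appears on the right. You can of course also simply look one up online.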

3. Add a Referer to the request headers, as follows:

    import urllib.request

    # Same User-Agent as before, plus a Referer header
    header = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.90 Safari/537.36 2345Explorer/9.5.2.18321',
        'Referer': 'http://www.google.com'
    }
    url = 'http://www.baidu.com/'
    request = urllib.request.Request(url=url, headers=header)
    resp = urllib.request.urlopen(request)
    print(resp.geturl())  # URL of the resource actually retrieved (after any redirects)
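
Before sending anything, it can be useful to check which headers the Request object actually holds. The following is a minimal sketch that makes no network call; it reuses the header values from the steps above (with a shortened User-Agent for readability). One detail worth knowing: urllib.request.Request stores header names using str.capitalize(), so 'User-Agent' has to be looked up as 'User-agent'.

```python
import urllib.request

header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.90 Safari/537.36',
    'Referer': 'http://www.google.com',
}
request = urllib.request.Request(url='http://www.baidu.com/', headers=header)

# Header names are normalized with str.capitalize(): 'User-Agent' -> 'User-agent'
print(request.has_header('User-agent'))
print(request.get_header('Referer'))

# List every (name, value) pair that will be sent with the request
for name, value in request.header_items():
    print(name, '=>', value)
```

Nothing is transmitted until urllib.request.urlopen(request) is called, so this check is safe to run offline.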

5. To use a random User-Agent, first collect a few browser User-Agent strings:

    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11",
    "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"
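
One possible way to pick a User-Agent at random from such a list is sketched below. This is an illustration rather than the original author's code: the names USER_AGENTS and build_request are hypothetical, and random.choice from the standard library is assumed as the selection method. The network call is left commented out so the sketch can be run offline.

```python
import random
import urllib.request

# Pool of User-Agent strings collected in the step above
USER_AGENTS = [
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1",
    "Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11",
    "Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11",
]


def build_request(url):
    """Hypothetical helper: build a Request with a randomly chosen User-Agent."""
    header = {'User-Agent': random.choice(USER_AGENTS)}
    return urllib.request.Request(url=url, headers=header)


request = build_request('http://www.baidu.com/')
# Request normalizes the header name to 'User-agent'
print(request.get_header('User-agent'))

# Uncomment to actually send the request:
# resp = urllib.request.urlopen(request)
# print(resp.getcode())
```

Calling build_request once per request means each request draws its own User-Agent, so consecutive requests may or may not share the same value.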

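7. Press F5 to run the program. Because the User-Agent is picked at random, successive requests will not necessarily send the same one, which helps to some extent against websites that use anti-crawler measures.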
