[Crawler 3] The Temptation of Public Grain

School starts tomorrow and my holiday is over!!! Wish me luck for the new term (I'm done for).

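A Matter of Luck

When scraping some sites, a plain request does not return the complete HTML. (I won't paste the HTML I got; run the code yourself. Either way, the HTML that comes back is incomplete.)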

from requests import get

url = "http://bbs.qyai.net/doings"
response = get(url).text
print(response)

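Normally, if you search Baidu for this, you will be told to simulate a login. Really, it is just your bad luck: the request above is sent without the cookies your browser holds, so the site treats you as a logged-out visitor.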


Fixing the Luck Problem

Press F12 right on the site to open the browser's developer tools and unleash the F12 technique.

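In the Network panel, picking the first request is usually enough. Scroll down in its Headers tab until you find Cookie, then apply some Ctrl+C / Ctrl+V engineering.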

# b is the Cookie string copied from the request headers
b = 'remember_web_59ba36addc2b2f9401580f014c7f58ea4e30989d=eyJpdiI6ImdvS3FDVTBSMmhxTERVc0I1RFprT3c9PSIsInZhbHVlIjoieTYza3pzUEhzSHZ6bm5nK05EU2VvQTN3anFUQ0RPN2JiTHVkODdNXC8zbHEwcGVMXC9CXC80NEUzeFNpV3krWlVvN1RnY3R3Q28rNWhFU1NmNGp4NitlNGVvRGdMdWxCWU1XS3ZkQXZOWlwvaFVGSUkzcTA3MmIya2owMlhFcVBWRWhPN0lITHhQblFvd1RhZUM3VXdYbkJyako0VWo5d25TVmdUSWg3UUxFbFhSWT0iLCJtYWMiOiJhYWVmNTdmZTY4YmQ3NmUzMzZhODY5YTQ4OTFjY2ZkYzE1OGQ2MzJmODMxODVjMjgxYWRjMGE2MTQ0YjNlZmVhIn0%3D; XSRF-TOKEN=eyJpdiI6IlR1bVlyOFlZNXV0MGh5VzlFMkNMaHc9PSIsInZhbHVlIjoiWFwvSklhbUVtMWVCQ3lDQ0ZoR05WZkNMdjVcL0pNMWQzYStJS2Jaa0p0N1VhMHpBaU1qdzJQYmd2cWZqM050RjI0IiwibWFjIjoiZGYzODU0ZTBiMzU1ZGRiNDA2ZWY0MzA5ZWE1N2I2OTY4Y2E1N2RmZTMyZjc5Njg2NTc3YjRmZTRkNzkzNTExOCJ9; tipask_session=eyJpdiI6ImpLTGdMVVp1UktXcFdpckswb0ZKcmc9PSIsInZhbHVlIjoiVFozaHgyVHpkOXV2Qk5DVExodElPXC91TVlXRmhMNTdqY3NPck9XRWwwblwvNXpCVWdIWnlGOWRlSU1LbThcLzFqMyIsIm1hYyI6IjVmNDA1NGM5MzQ0M2M3NjJiZjQxMzAxYjY0YWRhMGU2MmNmYjE4OTg5ZjA2NDAyMTk5YzY1MmRlYTQ4MjY5NjcifQ%3D%3D'

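# The cookies parameter does not take the raw str, so convert it to a dict
# (smart people read how the conversion works; smarter people know to search for it)
cookie = {}
for line in b.split(';'):
    # strip() drops the space after each ';' so keys carry no leading blank
    key, value = line.strip().split('=', 1)
    cookie[key] = value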

response = get(url,cookies=cookie).text
print(response)

With this cookie you can fetch the complete HTML, and you won't be able to keep away from the site [manual funny face]
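
The conversion step can also be wrapped in a small helper. This is only a sketch: `parse_cookie_header` is a hypothetical name, not something from `requests`, and the sample string uses made-up placeholder values rather than a real session. It also shows why stripping whitespace matters, since a bare split on `;` leaves a leading space on every key after the first.

```python
def parse_cookie_header(raw: str) -> dict:
    """Turn a copied Cookie header such as 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in raw.split(";"):
        pair = pair.strip()
        if not pair or "=" not in pair:
            continue  # skip empty pieces, e.g. after a trailing ';'
        # Split on the first '=' only, because values may contain '=' themselves
        name, value = pair.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


# Placeholder values, not a real session
sample = "XSRF-TOKEN=abc%3D; tipask_session=xyz%3D%3D"

# Naive split on ';' alone: the second key keeps its leading space
naive = dict(pair.split("=", 1) for pair in sample.split(";"))
print(list(naive))                          # ['XSRF-TOKEN', ' tipask_session']
print(list(parse_cookie_header(sample)))    # ['XSRF-TOKEN', 'tipask_session']
```

The returned dict can be passed as `cookies=` in the same way as the `cookie` variable above.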

Full code

from requests import get

url = "http://bbs.qyai.net/doings"

b = 'remember_web_59ba36addc2b2f9401580f014c7f58ea4e30989d=eyJpdiI6ImdvS3FDVTBSMmhxTERVc0I1RFprT3c9PSIsInZhbHVlIjoieTYza3pzUEhzSHZ6bm5nK05EU2VvQTN3anFUQ0RPN2JiTHVkODdNXC8zbHEwcGVMXC9CXC80NEUzeFNpV3krWlVvN1RnY3R3Q28rNWhFU1NmNGp4NitlNGVvRGdMdWxCWU1XS3ZkQXZOWlwvaFVGSUkzcTA3MmIya2owMlhFcVBWRWhPN0lITHhQblFvd1RhZUM3VXdYbkJyako0VWo5d25TVmdUSWg3UUxFbFhSWT0iLCJtYWMiOiJhYWVmNTdmZTY4YmQ3NmUzMzZhODY5YTQ4OTFjY2ZkYzE1OGQ2MzJmODMxODVjMjgxYWRjMGE2MTQ0YjNlZmVhIn0%3D; XSRF-TOKEN=eyJpdiI6IlR1bVlyOFlZNXV0MGh5VzlFMkNMaHc9PSIsInZhbHVlIjoiWFwvSklhbUVtMWVCQ3lDQ0ZoR05WZkNMdjVcL0pNMWQzYStJS2Jaa0p0N1VhMHpBaU1qdzJQYmd2cWZqM050RjI0IiwibWFjIjoiZGYzODU0ZTBiMzU1ZGRiNDA2ZWY0MzA5ZWE1N2I2OTY4Y2E1N2RmZTMyZjc5Njg2NTc3YjRmZTRkNzkzNTExOCJ9; tipask_session=eyJpdiI6ImpLTGdMVVp1UktXcFdpckswb0ZKcmc9PSIsInZhbHVlIjoiVFozaHgyVHpkOXV2Qk5DVExodElPXC91TVlXRmhMNTdqY3NPck9XRWwwblwvNXpCVWdIWnlGOWRlSU1LbThcLzFqMyIsIm1hYyI6IjVmNDA1NGM5MzQ0M2M3NjJiZjQxMzAxYjY0YWRhMGU2MmNmYjE4OTg5ZjA2NDAyMTk5YzY1MmRlYTQ4MjY5NjcifQ%3D%3D'
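cookie = {}
for line in b.split(';'):
    # strip() drops the space after each ';' so keys carry no leading blank
    key, value = line.strip().split('=', 1)
    cookie[key] = value

# The key part: the cookies parameter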
response = get(url,cookies=cookie).text
print(response)
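
As an aside, the dict conversion is only needed because of the `cookies` parameter. The copied string is already a valid `Cookie` header value, so it can also be sent unchanged as a header. The sketch below uses the standard library's `urllib.request` instead of `requests` so that it runs without extra packages. `build_request` and `fetch_html` are hypothetical helper names, and the cookie string is a placeholder.

```python
from urllib.request import Request, urlopen


def build_request(url: str, raw_cookie: str) -> Request:
    # The value copied from the browser is passed through as-is
    return Request(url, headers={"Cookie": raw_cookie})


def fetch_html(url: str, raw_cookie: str, timeout: float = 10.0) -> str:
    # Performs the actual network call; not invoked in this sketch
    with urlopen(build_request(url, raw_cookie), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


# Placeholder cookie string, not a real session
req = build_request("http://bbs.qyai.net/doings", "name=value; other=value2")
print(req.get_header("Cookie"))  # name=value; other=value2
```

With `requests`, the same idea would be `get(url, headers={"Cookie": b})`. Either way the cookie is a login credential and it expires, so an incomplete page later on may simply mean it needs to be copied again.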


  • Posted on 2021-02-26 18:27
此心安處是吾鄉