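Hi everyone, sorry to bother you again. While writing a scraper recently I ran into this term:
"宏碁电脑" (Acer), and found that it turns into garbled text once it is scraped: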
https://imgur.com/ZSV4gAe
After searching around, I found a post describing this kind of problem:
https://blog.hoamon.info/2008/05/python-big5.html
However, that fix does not seem to work in Python 3.7.
Has anyone run into something similar, and how should it be solved?
Here is the site:
https://tw.stock.yahoo.com/news/%E5%A4%96%E8%B3%87-%E8%B3%A3%E8%B6%85%E8%82%A1-%E5%AE%8F-%E7%A2%81-%E9%B4%BB-234706227.html
Code:
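import requests
from bs4 import BeautifulSoup

# News article whose title shows the problem
url = 'https://tw.stock.yahoo.com/news/%E5%A4%96%E8%B3%87-%E8%B3%A3%E8%B6%85%E8%82%A1-%E5%AE%8F-%E7%A2%81-%E9%B4%BB-234706227.html'
req = requests.get(url)
# req.text is decoded with whatever encoding requests picked for the response
bs = BeautifulSoup(req.text, 'html.parser')
# Print the article title; this is where the garbled text shows up
print(bs.find('h1').text)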
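
One possible direction, in case it helps anyone reading: the linked post is about "碁" sitting in a vendor extension of Big5, which Python's `cp950` codec covers and the plain `big5` codec may not. The sketch below assumes the page really is served as Big5 (I have not confirmed this for the Yahoo page). The helper name `decode_big5_bytes` is made up for illustration, and the sample bytes are produced locally rather than fetched from the site.

```python
def decode_big5_bytes(raw: bytes) -> str:
    """Decode Big5-family bytes using cp950 (hypothetical helper).

    cp950 is Microsoft's Big5 variant and includes extension
    characters such as "碁". Unmappable bytes become U+FFFD
    instead of raising an exception.
    """
    return raw.decode("cp950", errors="replace")


# Stand-in for the raw response body: the company name encoded as cp950.
sample = "宏碁".encode("cp950")

# Try the plain "big5" codec for comparison; if it rejects the bytes,
# record None instead of crashing.
try:
    strict_big5 = sample.decode("big5")
except UnicodeDecodeError:
    strict_big5 = None

title = decode_big5_bytes(sample)
print(title)
```

With requests, the equivalent would be to set `req.encoding = 'cp950'` before reading `req.text`, or to pass `req.content` to the helper above. If the page turns out to be UTF-8 rather than Big5, this will not help and the cause lies elsewhere.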