[Question] Python web scraping question - shot0512 - PTT (批踢踢实业坊)

Original poster: shot0512 (诚实豆沙包) 2020-07-23 17:45:44

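I'm a complete beginner at web scraping.
I've recently been learning how to write a crawler,
starting with the most basic static pages.
Here is my code: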
import requests
from bs4 import BeautifulSoup
import time

url = "http://www.eslite.com/Search_BW.aspx?query=python&searchType=&page=1"
html = requests.get(url).text
# The first argument is the document to parse; the second is the parser
soup = BeautifulSoup(html, 'html.parser')
page = 1
all_titles = []

def parse(html, page):
    print(page)
    all_td_tags = soup.find_all('td', class_="name")
    for item in all_td_tags:
        title = item.a.span.text.strip()
        all_titles.append(title)
    # Node for the "next page" link
    next_page_node = soup.find('a', id="ctl00_ContentPlaceHolder1_pager1_next")
    print(next_page_node.get('href'))
    print("

Author: TakiDog (多奇狗) 2020-07-23 18:37:00

Your request only happens once, so parse keeps running on the same data.

Original poster: shot0512 (诚实豆沙包) 2020-07-23 19:24:00

But doesn't next_html = requests.get(next_url).text already make another request here? Never mind, I see what's going on now. Thanks a lot!

Author: a28503662 (Ok Rocker) 2020-08-12 13:05:00

You probably need to have soup parse the new HTML again.
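
For anyone landing here later, the two replies can be sketched roughly as below. This is only a minimal sketch, not the original poster's final code: parse() builds a fresh soup from the html it is given instead of reading the module-level one, and the loop fetches and parses each page in turn. The names fetch, crawl, fetch_page, max_pages and NEXT_LINK_ID are made up for illustration, and the markup (td class "name", the next-link id) is assumed to match the post.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

# id of the "next page" link, copied from the original post
NEXT_LINK_ID = "ctl00_ContentPlaceHolder1_pager1_next"


def parse(html):
    """Parse ONE page of HTML; return (titles, href of the next page or None)."""
    # Build a new soup from the html argument on every call,
    # rather than reusing a soup created once at module level.
    soup = BeautifulSoup(html, "html.parser")
    titles = []
    for td in soup.find_all("td", class_="name"):
        # Assumes each cell looks like <td class="name"><a><span>title</span></a></td>
        titles.append(td.a.span.text.strip())
    next_node = soup.find("a", id=NEXT_LINK_ID)
    next_href = next_node.get("href") if next_node else None
    return titles, next_href


def fetch(url):
    """Download a page and return its HTML text."""
    return requests.get(url, timeout=10).text


def crawl(start_url, fetch_page=fetch, max_pages=5):
    """Follow 'next page' links, re-fetching and re-parsing at every step."""
    all_titles = []
    url = start_url
    for _ in range(max_pages):  # hypothetical safety cap on the number of pages
        titles, next_href = parse(fetch_page(url))
        all_titles.extend(titles)
        if not next_href:
            break
        url = urljoin(url, next_href)  # resolves relative hrefs against the current URL
    return all_titles
```

Calling crawl(url) with the search URL from the post would collect titles page by page. Passing the downloader in as fetch_page is just a convenience so the loop can be tried against saved HTML without hitting the site.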
