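Hi everyone,
I have recently been pulling data from
http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw
mainly because I want to download the data and organize it. When I fetch the data (for stock 1240, say) and inspect the request in Firefox, the headers look like this: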
http://www.tpex.org.tw/web/emergingstock/single_historical/download.php
Host: www.tpex.org.tw
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Referer: http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw
Content-Type: application/x-www-form-urlencoded
Content-Length: 84
Cookie: _ga=GA1.3.582781261.1509173813; _gid=GA1.3.454443446.1513917119; _gat=1
Connection: keep-alive
Upgrade-Insecure-Requests: 1
year=106&month=12&stkno=1240&stkname=茂生农经&lang=zh-tw
The last line does not look like something I can put into the headers as a normal header entry. How should I handle it?
My code is below (Python 3.5):
#!/usr/bin/env python3
# -*- coding: utf8 -*-
import urllib.request
url="http://www.tpex.org.tw/web/emergingstock/single_historical/download.php"
headers={
"Host":"www.tpex.org.tw",
"User-Agent":"Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0",
"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Language":"zh-TW,zh;q=0.8,en-US;q=0.5,en;q=0.3",
"Accept-Encoding":"gzip, deflate",
"Referer":"http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw",
"Content-Type":"application/x-www-form-urlencoded",
"Content-Length":"84",
"Cookie":"_ga=GA1.3.582781261.1509173813; _gid=GA1.3.1976747965.1513496313; _gat=1",
"Connection":"keep-alive",
"Upgrade-Insecure-Requests":"1",
# "year=106&month=12&stkno=1240&stkname=茂生农经&lang=zh-tw":""
}
req=urllib.request.Request(url,headers=headers)
response=urllib.request.urlopen(req)
print (str(response))
If I leave that last line out, the print output is:
<http.client.HTTPResponse object at 0x02700B10>
I searched online for a long time but still could not find a good solution.
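
For what it's worth, here is a minimal sketch of one possible approach. In the Firefox capture, the last line (year=106&month=12&...) is the body of a POST request rather than a header, so with urllib it would go into the data argument of Request, percent-encoded and converted to bytes. Encoded as UTF-8, the fields from the capture come to 84 bytes, which matches the Content-Length in the capture. The names build_request and download are made up for illustration, and how the server actually responds has not been verified here.

```python
#!/usr/bin/env python3
import urllib.parse
import urllib.request

URL = "http://www.tpex.org.tw/web/emergingstock/single_historical/download.php"
REFERER = "http://www.tpex.org.tw/web/emergingstock/single_historical/history.php?l=zh-tw"


def build_request(year, month, stkno, stkname, lang="zh-tw"):
    """Hypothetical helper: build a POST request carrying the form fields."""
    # A list of pairs keeps the field order identical to the Firefox capture.
    form = [
        ("year", year),
        ("month", month),
        ("stkno", stkno),
        ("stkname", stkname),
        ("lang", lang),
    ]
    # urlencode percent-encodes the non-ASCII stock name (UTF-8 by default);
    # the result is pure ASCII, and Request expects the body as bytes.
    body = urllib.parse.urlencode(form).encode("ascii")
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:57.0) Gecko/20100101 Firefox/57.0",
        "Referer": REFERER,
        "Content-Type": "application/x-www-form-urlencoded",
        # Accept-Encoding is left out on purpose: urllib does not
        # decompress gzip responses automatically.
    }
    # Supplying data= makes urllib issue a POST and set Content-Length itself.
    return urllib.request.Request(URL, data=body, headers=headers)


def download(req):
    """Hypothetical helper: send the request and return the raw body bytes."""
    with urllib.request.urlopen(req) as response:
        # read() returns the payload; printing the response object itself
        # only shows <http.client.HTTPResponse object at ...>.
        return response.read()
```

Something like download(build_request("106", "12", "1240", "茂生农经")) should then return the raw response bytes. What the server sends back (file format, text encoding) is something to check against the actual response before decoding it.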