Re: [Scoop] Historical online newspapers - taxi550 - PTT

Original poster: taxi550 (小姐到哪) 2015-01-05 13:05:00

Python version. It needs PIL and pycurl. It's rough, but it will do.
# coding=utf-8
import os
import time
import math
import pycurl
from io import BytesIO
from PIL import Image


def main():
    picUrl = r"https://event.franklin.com.tw/C2014_11_TGF/showimg.aspx?date="
    # Use the script's absolute directory so "pic" is created next to the script
    baseDir = os.path.dirname(os.path.abspath(__file__))
    path = os.path.join(baseDir, "pic")
    # no.jpg is the reference image used to detect dates with no scan
    noPic = Image.open(os.path.join(baseDir, "no.jpg"))
    noH = noPic.histogram()
    if not os.path.isdir(path):
        os.makedirs(path)
        print("Directory " + path + " did not exist; created it.")
    print("Images will be saved to " + path + ".")
    for y in range(1951, 2015):
        y = str(y)
        for m in range(1, 13):  # months 1..12
            # No data exists before 1951-09-16, so skip those months.
            # Not a great way to write it, but it will do.
            if m < 9 and y == "1951":
                continue
            m = "%02d" % m
            print("Fetching " + y + "-" + m + ".")
            for d in range(1, 32):  # days 1..31
                date = y + m + "%02d" % d
                savefile = os.path.join(path, date + ".jpg")
                # Skip images that already exist or were already downloaded
                if os.path.isfile(savefile):
                    print(savefile + " already exists.")
                    continue
                # Try to fetch the image
                try:
                    buffer = BytesIO()
                    c = pycurl.Curl()
                    c.setopt(c.URL, picUrl + date)
                    c.setopt(c.WRITEFUNCTION, buffer.write)
                    c.perform()
                    c.close()
                except pycurl.error:
                    # Fetching the image failed
                    continue
                try:
                    buffer.seek(0)
                    im = Image.open(buffer)
                    imH = im.histogram()
                    # Compare the images: the larger the number, the bigger
                    # the difference; a 100% match comes out close to 860
                    rms = math.sqrt(
                        sum((a - b) ** 2 for a, b in zip(noH, imH)) / len(noH))
                    if rms <= 870:
                        # Same as the reference image, so skip it
                        continue
                    # Images differ, so save this one
                    im.save(savefile, 'JPEG')
                except Exception:
                    continue
                # Image saved successfully
                print(time.strftime("%Y-%m-%d %H:%M:%S") + " saved " + savefile + ".")


if __name__ == '__main__':
    main()
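
For anyone who wants to try the two core ideas in the script above without hitting the site, here is a minimal sketch in Python 3. It assumes the first available date is 1951-09-16, as the script's comment says, and it walks real calendar days instead of looping over day numbers. The names `iter_dates` and `histogram_rms` are made up for illustration; they are not in the original script.

```python
import math
from datetime import date, timedelta


def iter_dates(start, end):
    """Yield YYYYMMDD strings for every calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        yield day.strftime("%Y%m%d")
        day += timedelta(days=1)


def histogram_rms(h1, h2):
    """Root-mean-square difference between two equal-length histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    squared = sum((a - b) ** 2 for a, b in zip(h1, h2))
    return math.sqrt(squared / len(h1))


# Assumed range: first day with data per the post, through the end of 2014.
dates = list(iter_dates(date(1951, 9, 16), date(2014, 12, 31)))
```

Each string in `dates` can be appended to the `date=` query parameter, and `histogram_rms` can take the lists returned by `Image.histogram()`. Identical histograms give 0.0, so the cutoff for "same as no.jpg" would have to be tuned against what the server actually returns.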

Author: WeasoN (WeasoN) 2015-01-05 13:05:00

Begging for a translation

Author: kuninaka 2015-01-05 13:05:00

60 points

Author: steward135 (逆風高飛) 2015-01-05 13:05:00

Might as well be hieroglyphics

Author: slent67 (史兰特67) 2015-01-05 13:06:00

XDDDD

Author: mobile02 (马英九ダイサイ) 2015-01-05 13:07:00

Thanks for sharing, now I can actually get tickets

Author: zxc17893 (嘻嘻) 2015-01-05 13:07:00

Somebody translate this for me

Author: LIONDODO (LION) 2015-01-05 13:08:00

Don't hammer their site xd...
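
One way to go easier on the server is to pause between requests. This is only a rough sketch in Python 3: the helper name `throttled` and the one-second delay are assumptions for illustration, not something from the original script.

```python
import time


def throttled(items, delay_seconds=1.0, sleep=time.sleep):
    """Yield items one by one, pausing between consecutive items."""
    first = True
    for item in items:
        if not first:
            # Wait before handing out every item except the first
            sleep(delay_seconds)
        first = False
        yield item
```

Wrapping the date loop as `for d in throttled(all_dates, delay_seconds=1.0):` would then space the downloads out, where `all_dates` stands for whatever list of date strings is being fetched.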

Author: kuninaka 2015-01-05 13:08:00

This code is the program Catwoman wanted for wiping her criminal record

Author: psinqoo (零度空間) 2015-01-05 13:10:00

A web crawler?

Author: CP64 ((￣▽￣＃)﹏﹏) 2015-01-05 13:14:00

It's just a script that grabs images on demand ' ~')

Author: ming1053 (ming) 2015-01-05 13:22:00

Too long, fail

Author: asd2260123 (南部大葉文組夜校肥宅) 2015-01-05 13:36:00

Upvote for Python

Author: rs6000 (正义的胖虎) 2015-01-05 13:51:00

Thanks for generously sharing this
