A Train Ticket Transfer Listing Collector Implemented in Python
Updated: 2014-07-09 10:01:26 Submitted by: junjie
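This article presents a Python script that collects train ticket transfer listings. The listings come from 58.com or Ganji (ganji.com). Feel free to use it as a reference.
All right, I admit it: last night I spotted a suitable ticket up for transfer, but when I called, it had already been snapped up, and that has been bugging me. Here is the file.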
#coding: utf-8
'''
Look up train ticket transfer listings for the Spring Festival travel rush

Author: piglei2007@gmail.com
Date: 2011.01.25

Note: this is a Python 2 script (urllib2, urlparse, BeautifulSoup 3).
'''
import re
import os
import time
import urlparse
import datetime
import traceback
import urllib2
import socket

# Give up on any request that takes longer than 20 seconds
socket.setdefaulttimeout(20)

BLANK_RE = re.compile(r"\s+")

# Send browser-like headers and keep cookies between requests
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
opener.addheaders = [
    ("User-agent", "Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.9.1) Gecko/20090704 Firefox/3.5"),
    ("Accept", "*/*"),
]
urllib2.install_opener(opener)

from BeautifulSoup import BeautifulSoup

# URL templates per site; %(train)s and %(date)s are filled in by main()
SOURCE = {
    "58": "http://bj.58.com/huochepiao/?Num=%(train)s&StartTime=%(date)s00",
    "ganji": "http://bj.ganji.com/piao/cc_%(train)s/%(date)s/",
}

# URLs that have already been emailed, one per line
RECORD_FILE = "/tmp/ticket_records.txt"

def parse_record():
    try:
        with open(RECORD_FILE, "r") as f:
            return set([x.strip() for x in f.readlines()])
    except IOError:
        # First run: create an empty record file
        open(RECORD_FILE, "w").close()
        return set()

def flush_record(records):
    with open(RECORD_FILE, "w") as f:
        f.write("\n".join(records))

def main(config):
    """
    Start scraping
    """
    existed = parse_record()
    to_email = []
    for train in config["trains"]:
        for date in config["dates"]:
            for type, _url in SOURCE.items():
                url = _url % dict(train=train, date=date)
                content = urllib2.urlopen(url).read()
                soup = BeautifulSoup(content)
                result = parse_content(type, soup, train)
                for url, text in result:
                    # Resolve relative links against the site's URL
                    url = urlparse.urljoin(_url, url)
                    # Sleeper berths only! (卧 = sleeper)
                    if url not in existed and u"卧" in text:
                        to_email.append([text, url])
                        existed.add(url)
    if to_email:
        # One listing per line; the mail is sent as HTML
        content = "<br />".join(
            [" | ".join(y) for y in to_email]
        ).encode("utf-8")
        simple_mail(config["people"], content)
    flush_record(existed)

def parse_content(type, soup, train):
    """
    Extract the listings for the given train number
    """
    result = []
    if type == "58":
        info_table = soup.find("table", id="infolist")
        if info_table:
            # Match text containing the train number, skipping
            # timetable links (时刻表 = timetable)
            for x in info_table.findAll("tr", text=re.compile(ur"%s(?!时刻表)" % train, re.I)):
                a = x.parent
                _text = BLANK_RE.sub("", a.text)
                result.append([a["href"], _text])
    if type == "ganji":
        for x in soup.findAll("dl", {"class": "list_piao"}):
            a = x.dt.a
            result.append([a["href"], a.text])
    return result

EMAIL_HOST = 'smtp.sohu.com'
EMAIL_HOST_USER = 'yourname@sohu.com'
EMAIL_HOST_PASSWORD = 'yourpassword'
EMAIL_PORT = 25

def simple_mail(to, content):
    """
    Send the notification email
    """
    import smtplib
    from email.mime.text import MIMEText
    msgRoot = MIMEText(content, 'html', 'UTF-8')
    msgRoot['Subject'] = "[%s] Tickets available!" % datetime.datetime.today().isoformat(" ")
    msgRoot['From'] = EMAIL_HOST_USER
    msgRoot['To'] = ", ".join(to)
    s = smtplib.SMTP(EMAIL_HOST, EMAIL_PORT)
    s.login(EMAIL_HOST_USER, EMAIL_HOST_PASSWORD)
    s.sendmail(EMAIL_HOST_USER, to, msgRoot.as_string())
    s.close()

def switch_time_zone():
    """
    Switch the time zone
    """
    os.environ["TZ"] = "Asia/Shanghai"
    time.tzset()

switch_time_zone()

if __name__ == '__main__':
    config = {
        "trains": ("k471",),
        "dates": ("20110129",),
        "people": (
            "youremail@sohu.com",
        )
    }
    try:
        main(config)
        print "%s: ok" % datetime.datetime.today()
    except Exception, e:
        print traceback.format_exc()
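To make the URL handling in main() easier to follow, here is a small standalone sketch of how the SOURCE templates expand and how a relative link gets resolved. It is written for Python 3 (where urljoin lives in urllib.parse), so it is not a drop-in piece of the Python 2 script. The helper name build_urls and the sample href are made up for illustration.

```python
from urllib.parse import urljoin

# Same %(name)s templates as SOURCE in the script above.
SOURCE = {
    "58": "http://bj.58.com/huochepiao/?Num=%(train)s&StartTime=%(date)s00",
    "ganji": "http://bj.ganji.com/piao/cc_%(train)s/%(date)s/",
}


def build_urls(trains, dates):
    """Return one (site, url) pair per train, date, and site (hypothetical helper)."""
    urls = []
    for train in trains:
        for date in dates:
            # sorted() only makes the order predictable for this demo
            for site, template in sorted(SOURCE.items()):
                urls.append((site, template % {"train": train, "date": date}))
    return urls


for site, url in build_urls(("k471",), ("20110129",)):
    print(site, url)

# The script joins each scraped href against the template itself; only the
# scheme and host of the base matter when the href is an absolute path.
# "/huochepiao/1234.shtml" is an invented href, not a real listing.
print(urljoin(SOURCE["58"], "/huochepiao/1234.shtml"))
```

Because the placeholders are named, each template can put the train number and the date wherever the site expects them, and the loop does not need to know the difference.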
Then add it to cron; you know the drill.
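Since cron starts the script from scratch on every run, the record file is what keeps the same listing from being emailed twice. The following is a rough Python 3 sketch of that step, not the author's code: the function names load_seen, save_seen, and pick_new are invented, the listings are fake example.com entries, and a temporary directory stands in for /tmp/ticket_records.txt.

```python
import os
import tempfile


def load_seen(path):
    """Read previously reported URLs, one per line; a missing file means none."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return {line.strip() for line in f if line.strip()}
    except IOError:
        return set()


def save_seen(path, seen):
    """Write the reported URLs back, one per line."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(sorted(seen)))


def pick_new(listings, seen, keyword="卧"):
    """Return unseen listings whose text contains keyword, marking them as seen."""
    fresh = []
    for url, text in listings:
        if url not in seen and keyword in text:
            fresh.append((url, text))
            seen.add(url)
    return fresh


# Two simulated cron runs over the same made-up listings.
path = os.path.join(tempfile.mkdtemp(), "ticket_records.txt")
listings = [
    ("http://example.com/a", "K471 硬卧 转让"),  # hard sleeper: wanted
    ("http://example.com/b", "K471 硬座 转让"),  # hard seat: ignored
]
for run in (1, 2):
    seen = load_seen(path)
    fresh = pick_new(listings, seen)
    save_seen(path, seen)
    print(run, fresh)
```

The first run reports only the sleeper listing and the second run reports nothing, which is the behavior you want from a job that fires every few minutes.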