
Steps for drawing a stock candlestick (K-line) chart with Python

Updated: 2019-06-28 10:01:23  Author: allenmagic

This article walks through the steps for drawing a stock candlestick chart with Python, with detailed example code. It may be a useful reference for study or work.

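Introduction

This article briefly shows how to fetch the price data of a stock from NetEase Finance and draw the corresponding daily candlestick chart from it. It is meant to help beginners get to know and use the relevant Python features, including lists, user-defined functions, for loops, if statements, and plotting with matplotlib.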

Step 1: Fetch the stock price data from NetEase Finance

I usually look up a stock's price and trading data on NetEase Finance, which covers every stock listed in Shanghai and Shenzhen. We use China Merchants Bank as the example.

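1. Build a scraper to fetch the stock price data

Python itself is not introduced here. If you need to find out what Python is, search online (for example on Baidu) or visit the official Python website.

Load the required modules

The code is as follows: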

import re,urllib2,time,csv,datetime
import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.finance as mpf
import matplotlib.dates as mpd

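Here urllib2 is the package that fetches web page content from a URL. re is the regular expression package, used in this article to pull the useful values out of the fetched page. time and datetime handle time, mainly to set the period to fetch and for other date processing. csv generates the CSV data (which will be used to draw the candlestick chart in R). The remaining packages are introduced where they are used, and you can also add each import at the top of the program when you need it. Note that these imports target Python 2 (urllib2 does not exist in Python 3) and an older matplotlib: matplotlib.finance was removed from matplotlib in version 2.2 and lives on as the separate mpl_finance package.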

Time settings

The code is as follows:

t = time.localtime() # current local time
year = range(t[0],1989,-1) # years in descending order, from the current year back to 1990, when the Shanghai exchange opened
season = range(4,0,-1)  # quarters in descending order, from Q4 to Q1

Why set up the time this way? A close look at the URLs of the NetEase stock data shows that they are built from a year and a quarter, and the site's own search also works by year and quarter.

China Merchants Bank, Q1 2017 data

Its URL is as follows:

http://quotes.money.163.com/trade/lsjysj_600036.html?year=2017&season=1

It can be split into 6 substrings: http://quotes.money.163.com/trade/lsjysj_, 600036, .html?year=, 2017, &season=, and 1. The 2nd, 4th and 6th substrings can be treated as parameters to request exactly the data that is needed.
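The split described above can be sketched as a small helper. This is an illustrative Python 3 snippet (the article's own code targets Python 2), and the name build_url is hypothetical; it is not part of the original program.

```python
# Hypothetical helper: assemble the NetEase history URL from its three variable parts.
BASE = "http://quotes.money.163.com/trade/lsjysj_"

def build_url(code, year, season):
    """Return the quarterly history page URL for a stock code."""
    return "{}{}.html?year={}&season={}".format(BASE, code, year, season)

print(build_url(600036, 2017, 1))
```

Passing the code as a string keeps leading zeros, which matters for codes such as 000001.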

Define the function that fetches the data

The code is as follows:

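def getData(url):
  # Download the raw HTML of the page
  request = urllib2.Request(url)
  response = urllib2.urlopen(request)
  content = response.read()

  # Keep only the part of the table after the header, which holds the data rows
  pattern = re.compile(r'</thead[\s\S]*</tr>  </table>')
  ta = re.findall(pattern, str(content))
  # Strip the colour classes and the thousands separators
  pattern1 = re.compile("<td class='cGreen'>")
  pattern2 = re.compile("<td class='cRed'>")
  pattern3 = re.compile(",")
  tab1 = re.sub(pattern1,"<td>",str(ta))
  tab2 = re.sub(pattern2,"<td>",str(tab1))
  tab = re.sub(pattern3, "", str(tab2))

  # str() of an empty list is "[]", never "", so test the match list itself
  if len(ta) == 0:
    data = []
  else:
    pattern4 = re.compile('<td>(.*?)</td>')
    data = re.findall(pattern4, str(tab))

  # Drop empty cells by building a new list; removing items from a list
  # while iterating over it can skip elements
  data = [d for d in data if d != '']

  return data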

This code defines a function getData(url): the function name is getData and its parameter is url. It fetches the stock's trading data from that URL, so it is clearly tailored to this one site.

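First, functions from the urllib2 module fetch the content of the page. Second, the re module does a first pass over the fetched content, in three stages:

Match the pattern </thead[\s\S]*</tr>  </table> and return the result, because this span contains all the data we need. It is a simple regular expression that returns everything from </thead through </tr>  </table>.
Match <td class='cGreen'> and <td class='cRed'> and replace them with <td>. These two strings get in the way of the later matching, so replacing them first makes the data easier to extract.
Remove the thousands separator ",". Neither Python nor R parses a number that contains thousands separators directly, so the values are converted to plain digits.
tab is the resulting raw content, still a mix of data and markup.
An if statement then separates the data from the markup. When year and season fall outside the available range, nothing is matched, so that case is checked here; according to the author, leaving the check out produces an index error. If nothing was matched, data is an empty list; otherwise the <td>(.*?)</td> pattern returns the wanted cells into the data list.
Finally, the data list is traversed and empty entries are dropped (the author considers this step redundant, since the empty case has already been handled by the if).

With this function defined, we can use it to return the data for a specific URL.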
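To see the three regular-expression stages in isolation, here is a minimal Python 3 sketch that runs them on a made-up HTML fragment. The SAMPLE string and the name parse_cells are assumptions for illustration; the real NetEase page may be laid out differently, and the sketch works on the page text directly rather than through str() of a list.

```python
import re

# Made-up fragment imitating the layout the article's patterns expect;
# the real page may differ.
SAMPLE = (
    "<table><thead><tr><th>Date</th><th>Open</th><th>Volume</th></tr></thead>"
    "<tr><td>2017-03-31</td><td class='cRed'>19.17</td><td>1,234,567</td></tr>"
    "<tr><td>2017-03-30</td><td class='cGreen'>19.05</td><td>987,654</td></tr>"
    "  </table>"
)

def parse_cells(html):
    """Return the text of every table cell after the header, as a flat list."""
    body = re.search(r"</thead[\s\S]*</tr>  </table>", html)
    if body is None:  # no table found, e.g. a quarter with no data
        return []
    text = body.group(0)
    # Stage 2: turn the coloured cells into plain <td> cells
    text = re.sub(r"<td class='c(?:Green|Red)'>", "<td>", text)
    # Stage 3: drop thousands separators so the values can go through float()
    text = text.replace(",", "")
    cells = re.findall(r"<td>(.*?)</td>", text)
    return [c for c in cells if c != ""]

print(parse_cells(SAMPLE))
```

Checking the search result for None plays the role of the if test in getData: a page without a data table simply yields an empty list.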

Fetch the data for one stock

The code is as follows:

def get_stock_price(code):
  url1 = "http://quotes.money.163.com/trade/lsjysj_"
  url2 = ".html?year="
  url3 = "&season="
  urllist = []
  for k in year:
    for v in season:
      urllist.append(url1+str(code)+url2+str(k)+url3+str(v))
  
  price = []
  for url in urllist:
    price.extend(getData(url))
  return price

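get_stock_price(code) is a custom function, where code is the stock code. It returns all the historical data for that stock (OHLC and the other fields). The idea is simple:

Build the list of URLs of the stock's data pages from code.
Use the getData(url) function in a for loop to return all the historical data.

What is finally returned is the price data list.

We can now use this function to fetch the full history of a stock:

# get all historical data, including prices and the other fields
price = get_stock_price(600036)

This fetches the full history of China Merchants Bank (600036).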

2. Save the data

Save as a CSV file

The code is as follows:

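# Pick the wanted columns out of every group of 11 values
pr = []
for i in range(0,len(price),11):
  pr.append([price[i],price[i+1],price[i+2],price[i+3],price[i+4],price[i+8]])

# Python 2: open in binary mode for csv; the with block closes the file
with open("stock.csv",'wb') as f:
  writer = csv.writer(f)
  writer.writerow(['Date','Open','High','Low','Close','Volume'])
  for prl in pr:
    writer.writerow(prl)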

We use the csv module to save the data as a CSV file, to be read and plotted in R. The table shown on NetEase has 11 fields per row, so from each slice of 11 values we take the date, OHLC (open, high, low, close) and volume, and save them in CSV format.
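The slicing step can be sketched on its own in Python 3. The names to_rows and write_csv and the sample values are made up for illustration; the offsets are the ones used in the article's code. The sketch writes to an in-memory buffer so that it runs without touching the disk.

```python
import csv
import io

FIELDS_PER_ROW = 11          # cells per table row, as stated in the article
KEEP = (0, 1, 2, 3, 4, 8)    # offsets the article's code writes out

def to_rows(flat):
    """Split the flat cell list into rows and keep the selected columns."""
    rows = []
    # Stop before an incomplete trailing group to avoid an IndexError
    for i in range(0, len(flat) - FIELDS_PER_ROW + 1, FIELDS_PER_ROW):
        rows.append([flat[i + k] for k in KEEP])
    return rows

def write_csv(rows, handle):
    """Write the header and the rows to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(["Date", "Open", "High", "Low", "Close", "Volume"])
    writer.writerows(rows)

# Made-up values, only to show the shape of the data
flat = ["2017-03-31", "19.1", "19.3", "19.0", "19.2",
        "0.1", "0.5", "100", "200", "1.5", "0.2"]
buf = io.StringIO()
write_csv(to_rows(flat), buf)
print(buf.getvalue())
```

When writing to a real file in Python 3, the csv documentation recommends opening it with newline="" (for example open("stock.csv", "w", newline="")) instead of the binary mode used in Python 2.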

Process the data and store it in a list

The code is as follows:

# get the number for date by date2num
def Date_no(strdate):
  t = time.strptime(strdate, "%Y-%m-%d")
  y,m,d = t[0:3]
  d = datetime.date(y, m, d)
  n = mpd.date2num(d)

  return n

# get the price data 
pr = []
for i in range(0,len(price),11):
  pr.extend([[
    Date_no(price[i])
    ,float(price[i+1])
    ,float(price[i+2])
    ,float(price[i+3])
    ,float(price[i+4])
    ,float(price[i+8])]]
    )

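This snippet processes and stores the data used to draw the candlestick chart in Python.

Define a function that converts the date string into the numeric value matplotlib uses for plotting (the dates in the fetched data are strings).
Return the date, OHLC (open, high, low, close) and volume, and store them in the list pr.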
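The date conversion can be tried without matplotlib. The sketch below is Python 3 and uses only the standard library; parse_date is a hypothetical name. matplotlib's date2num returns a day count whose reference date depends on the matplotlib version, so date.toordinal() is used here purely as a stand-in to show the idea of a date becoming a number.

```python
import datetime

def parse_date(strdate):
    """Turn a 'YYYY-MM-DD' string into a datetime.date."""
    return datetime.datetime.strptime(strdate, "%Y-%m-%d").date()

d = parse_date("2017-03-31")
# toordinal() is a day count, so consecutive calendar days differ by exactly 1
print(d, d.toordinal())
```

Because the value counts calendar days, a weekend or holiday leaves a jump in the numbers, which is worth keeping in mind when the values are used as x coordinates.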

Step 2: Draw the candlestick chart

Plotting in R

The code is as follows:

library(quantmod)

rm(list = ls())
setwd("~/GitHub/index/")
price <- as.xts(read.zoo("stock.csv",header=TRUE,sep=",",colClasses = c("Date", rep("numeric",5))))

n <- nrow(price)
m <- nrow(price)-100

#pdf(file = "k.pdf")
chartSeries(price[c(m:n)],theme = chartTheme("white"),up.col = "red",dn.col = "green",name = "600036",time.scale = 0.5,line.type = "l",bar.type = "ohlc",major.ticks='auto', minor.ticks=TRUE)
#dev.off()

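The resulting chart looks like this:

In R, the chartSeries function from the quantmod package draws candlestick charts. See the chartSeries reference documentation for details on how to use it.

Plotting in Python with matplotlib

The code is as follows: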

quotes = pr[0:80]

print(quotes)

fig,ax = plt.subplots(figsize=(30,6))
fig.subplots_adjust(bottom=0.2)
mpf.candlestick_ohlc(ax,quotes,width=0.4,colorup='r',colordown='g')
plt.grid(False)
ax.xaxis_date()
ax.autoscale_view()
plt.setp(plt.gca().get_xticklabels(), rotation=30) 
plt.show()

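The resulting candlestick chart looks like this:

See the reference documentation for matplotlib's candlestick_ohlc. There are still some problems at the moment: non-trading days are also placed on the x-axis, which causes gaps in the candlestick series. A fix is left for a later step.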
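One common way around the gaps, which is not the author's method, is to plot against consecutive integers instead of date numbers and to use the date strings only as tick labels. The sketch below is plain Python 3 and prepares the data for that; reindex_quotes and the sample rows are hypothetical, and it assumes each row still carries its date as a string in the first column, oldest row first.

```python
def reindex_quotes(quotes, label_every=10):
    """Replace the date column with 0, 1, 2, ... so trading days sit side by side.

    quotes: rows of [date_string, open, high, low, close, volume], oldest first.
    Returns (indexed_rows, tick_positions, tick_labels).
    """
    indexed = [[i] + list(row[1:]) for i, row in enumerate(quotes)]
    positions = list(range(0, len(quotes), label_every))
    labels = [quotes[i][0] for i in positions]
    return indexed, positions, labels

# Made-up rows; 2017-04-01 to 2017-04-04 are missing, as on non-trading days
sample = [
    ["2017-03-30", 19.0, 19.2, 18.9, 19.1, 100.0],
    ["2017-03-31", 19.1, 19.3, 19.0, 19.2, 120.0],
    ["2017-04-05", 19.2, 19.6, 19.1, 19.5, 150.0],
]
indexed, positions, labels = reindex_quotes(sample, label_every=2)
print(indexed)
print(positions, labels)
```

The indexed rows can then be passed to candlestick_ohlc in place of quotes, with ax.set_xticks(positions) and ax.set_xticklabels(labels) replacing the ax.xaxis_date() call.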

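The code has been pushed to my GitHub for reference; stock.py is the main program.

A final note: I had not used Python for nearly two years, so the code may not be very concise. My aim was simply to solve the problem, and there are of course many ways to do that. In this case the most direct approach is to use NetEase's download feature to export all the data (or a specific period) as CSV and then draw the candlestick chart in Excel.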
