欧美bbbwbbbw肥妇,免费乱码人妻系列日韩,一级黄片

使用python解析xml成對(duì)應(yīng)的html示例分享

 更新時(shí)間:2014年04月02日 11:13:34   作者:  
這篇文章主要介紹了使用python解析xml成對(duì)應(yīng)的html示例,需要的朋友可以參考下

SAX將dd.xml解析成html。當(dāng)然啦,如果得到了xml對(duì)應(yīng)的xsl文件可以直接用libxml2將其轉(zhuǎn)換成html。

復(fù)制代碼 代碼如下:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
#---------------------------------------
#   程序:XML解析器
#   版本:01.0
#   作者:mupeng
#   日期:2013-12-18
#   語言:Python 2.7
#   功能:將xml解析成對(duì)應(yīng)的html
#   注解:該程序用xml.sax模塊的parse函數(shù)解析XML,并生成事件
#   繼承ContentHandler并重寫其事件處理函數(shù)
#   Dispatcher主要用于相應(yīng)標(biāo)簽的起始、結(jié)束事件的派發(fā)
#---------------------------------------
from xml.sax.handler import ContentHandler
from xml.sax import parse

class Dispatcher:
    def dispatch(self, prefix, name, attrs=None):
        mname = prefix + name.capitalize()
        dname = 'default' + prefix.capitalize()
        method = getattr(self, mname, None)
        if callable(method): args = ()
        else:
            method = getattr(self, dname, None)
            #args = name
        #if prefix == 'start': args += attrs
        if callable(method): method()

    def startElement(self, name, attrs):
        self.dispatch('start', name, attrs)

    def endElement(self, name):
        self.dispatch('end', name)

class Website(Dispatcher, ContentHandler):

    def __init__(self):
        self.fout = open('ddt_SAX.html', 'w')
        self.imagein = False
        self.desflag = False
        self.item = False
        self.title = ''
        self.link = ''
        self.guid = ''
        self.url = ''
        self.pubdate = ''
        self.description = ''
        self.temp = ''
        self.prx = ''
    def startChannel(self):

        self.fout.write('''<html>\n<head>\n<title> RSS-''')

    def endChannel(self):
       self.fout.write('''
                    <tr><td height="20"></td></tr>
                    </table>
                    </center>
                    <script>
    function  GetTimeDiff(str)
    {
     if(str == '')
     {
      return '';
     }

     var pubDate = new Date(str);
     var nowDate = new Date();
     var diffMilSeconds = nowDate.valueOf()-pubDate.valueOf();
     var days = diffMilSeconds/86400000;
     days = parseInt(days);

     diffMilSeconds = diffMilSeconds-(days*86400000);
     var hours = diffMilSeconds/3600000;
     hours = parseInt(hours);

     diffMilSeconds = diffMilSeconds-(hours*3600000);
     var minutes = diffMilSeconds/60000;
     minutes = parseInt(minutes);

     diffMilSeconds = diffMilSeconds-(minutes*60000);
     var seconds = diffMilSeconds/1000;
     seconds = parseInt(seconds);

     var returnStr = "±±¾©·¢²¼Ê±¼ä£º" + pubDate.toLocaleString();

     if(days > 0)
     {
      returnStr = returnStr + "&nbsp;£¨¾àÀëÏÖÔÚ" + days + "Ìì" + hours + "Сʱ" + minutes + "·ÖÖÓ£©";
     }
     else if (hours > 0)
     {
      returnStr = returnStr + "&nbsp;£¨¾àÀëÏÖÔÚ" + hours + "Сʱ" + minutes + "·ÖÖÓ£©";
     }
     else if (minutes > 0)
     {
      returnStr = returnStr + "&nbsp;£¨¾àÀëÏÖÔÚ" + minutes + "·ÖÖÓ£©";
     }

     return returnStr;

    }

    function GetSpanText()
    {
     var pubDate;
     var pubDateArray;
     var spanArray = document.getElementsByTagName("span");

     for(var i = 0; i < spanArray.length; i++)
     {
      pubDate = spanArray[i].innerHTML;
      document.getElementsByTagName("span")[i].innerHTML = GetTimeDiff(pubDate);   
     }
    }

    GetSpanText();
   </script>
                </body>
                </html>
                ''')
       self.fout.close()

    def characters(self, chars):
        if chars.strip():
            #chars = chars.strip()
            self.temp += chars
            #print self.temp

      
    def startTitle(self):

        if self.item:
            self.fout.write('''
                        <tr bgcolor="#eeeeee">\n<td style="padding-top:5px;padding-left:5px;" height="30">\n<B>
                    ''')

    def endTitle(self):

        if not self.imagein and not self.item:
            self.title = self.temp
            self.temp = ''
            self.fout.write(self.title.encode('gb2312'))

            #self.title = self.temp
            self.fout.write('''
                </title>\n</head>\n<body>\n<center>\n
                <script>\n

                        function copyLink()
                        {
                                clipboardData.setData("Text",window.location.href);
                                alert("RSSÁ´½ÓÒѾ­¸´ÖƵ½¼ôÌù°å");
                        }

                        function subscibeLink()
                        {
                                var str = window.location.pathname;
                                while(str.match(/^\//))
                                {
                                        str = str.replace(/^\//,"");
                                }
                                window.open("http://rss.sina.com.cn/my_sina_web_rss_news.html?url=" + str,"_self");

                        }
                        </script>\n
                <table width="750" cellpadding="0" cellspacing="0">\n
                <tr>\n
                <td align="right" style="padding-right:15px;" valign="bottom">\n
            ''')

        if self.item:
            self.title = self.temp
            self.temp = ''
            self.fout.write(self.title.encode('gb2312'))
            self.fout.write('''
                        </B>
                        </td>
                        </tr>
                        <tr bgcolor="#eeeeee">
                        <td style="padding-left:5px;">
                        ''')

    def startImage(self):
        self.imagein = True

    def endImage(self):
        self.imagein = False

    def startLink(self):
        if self.imagein:
            self.fout.write('''<A href=" ''')

           
    def endLink(self):
        self.link = self.temp
        self.temp = ''
        if self.imagein:
            self.fout.write(self.link.encode('gb2312'))
            self.fout.write('''" target="_blank">\n ''')
        elif self.item:
            #self.link = self.temp
            pass
        else:
            self.fout.write(self.link)
            self.fout.write(''' " target="
      _blank
     "> ''')
            self.fout.write(self.title.encode('gb2312'))
            self.fout.write(''' </A></B></td>
                            </tr>
                            <tr><td colspan="2" align="center">
                            ''')
            self.fout.write(self.description.encode('gb2312'))
            self.fout.write('''
                        </td></tr>
                        <tr style="font-size:12px;" bgcolor="#eeeeff"><td colspan="2" style="font-size:14px;padding-top:5px;padding-bottom:5px;"><b><a href="javascript:copyLink();">¸´ÖÆ´ËÒ³Á´½Ó</a>                <a href="javascript:subscibeLink();">ÎÒҪǶÈë¸ÃÐÂÎÅÁÐ±íµ½ÎÒµÄÒ³Ãæ£¨¼òµ¥¡¢¿ìËÙ¡¢ÊµÊ±¡¢Ãâ·Ñ£©</a></b></td></tr>
                        </table>
                        <table width="750" cellpadding="0" cellspacing="0">
                            ''')

    def startUrl(self):
        if self.imagein:
            self.fout.write('''<IMG src=" ''')
    def endUrl(self):
        self.url = self.temp
        self.temp = ''
        if self.imagein:
            self.fout.write(self.url.encode('gb2312'))
            self.fout.write('''" border="0">\n
                            </A>
                            </td>
                            <td align="left" valign="bottom" style="padding-bottom:8px;"><B><A href="
                            ''')
        if self.item:
            #self.url = self.temp
            pass

    def defaultStart(self):
        pass
    def defaultEnd(self):
        self.temp = ''
    def startDescription(self):
        pass
    def endDescription(self):
        self.description = self.temp
        self.temp = ''
        if self.item:
            #self.fout.write('¡¡¡¡')
            self.fout.write(self.description.encode('gb2312'))

    def endGuid(self):
        self.guid = self.temp
    def endPubdate(self):
        if not self.temp.startswith('http'):
         self.pubdate = self.temp
         self.temp = ''
        else:
            self.pubdate = ''
    def startItem(self):
        self.item = True
    def endItem(self):
        self.item = False
        self.fout.write('''
                            </td>
                            </tr>
                            <tr bgcolor="#eeeeee">
                            <td style="padding-top:5px;padding-left:5px;">
                            <A href="''')
        self.fout.write(self.link)
        self.fout.write(''' " target="_blank"> ''')
        self.fout.write(self.guid)
        self.fout.write('''
                        </A>
                        </td>
                        </tr>
                        <tr bgcolor="#eeeeee">
                        <td style="padding-top:5px;padding-left:5px;padding-bottom:5px;"><span>''')
        self.fout.write(self.pubdate)
        self.fout.write('''</span></td>
                        </tr>
                        <tr height="10"><td></td></tr>''')

#程序入口
if __name__ == '__main__':
    parse('ddt.xml', Website())

相關(guān)文章

  • 使用XML庫(kù)的方式,實(shí)現(xiàn)RPC通信的方法(推薦)

    使用XML庫(kù)的方式,實(shí)現(xiàn)RPC通信的方法(推薦)

    下面小編就為大家?guī)硪黄褂肵ML庫(kù)的方式,實(shí)現(xiàn)RPC通信的方法(推薦)。小編覺得挺不錯(cuò)的,現(xiàn)在就分享給大家,也給大家做個(gè)參考。一起跟隨小編過來看看吧
    2017-06-06
  • 基于python cut和qcut的用法及區(qū)別詳解

    基于python cut和qcut的用法及區(qū)別詳解

    今天小編就為大家分享一篇基于python cut和qcut的用法及區(qū)別詳解,具有很好的參考價(jià)值,希望對(duì)大家有所幫助。一起跟隨小編過來看看吧
    2019-11-11
  • Python賦值邏輯的實(shí)現(xiàn)

    Python賦值邏輯的實(shí)現(xiàn)

    本文主要介紹了 Python賦值邏輯的實(shí)現(xiàn),文中通過示例代碼介紹的非常詳細(xì),對(duì)大家的學(xué)習(xí)或者工作具有一定的參考學(xué)習(xí)價(jià)值,需要的朋友們下面隨著小編來一起學(xué)習(xí)學(xué)習(xí)吧
    2023-02-02
  • python根據(jù)路徑導(dǎo)入模塊的方法

    python根據(jù)路徑導(dǎo)入模塊的方法

    這篇文章主要介紹了python根據(jù)路徑導(dǎo)入模塊的方法,分析了傳統(tǒng)方法與改進(jìn)方法,具有一定的實(shí)用價(jià)值,需要的朋友可以參考下
    2014-09-09
  • 用Python編寫生成樹狀結(jié)構(gòu)的文件目錄的腳本的教程

    用Python編寫生成樹狀結(jié)構(gòu)的文件目錄的腳本的教程

    這篇文章主要介紹了用Python編寫生成樹狀結(jié)構(gòu)的文件目錄的腳本的教程,是一個(gè)利用os模塊下各函數(shù)的簡(jiǎn)單實(shí)現(xiàn),需要的朋友可以參考下
    2015-05-05
  • Python爬蟲+Tkinter制作一個(gè)翻譯軟件的示例

    Python爬蟲+Tkinter制作一個(gè)翻譯軟件的示例

    這篇文章主要介紹了Python爬蟲+Tkinter制作一個(gè)翻譯軟件的示例,幫助大家更好的理解和學(xué)習(xí)使用python,感興趣的朋友可以了解下
    2021-02-02
  • Python中自定義函方法與參數(shù)具有默認(rèn)值的函數(shù)

    Python中自定義函方法與參數(shù)具有默認(rèn)值的函數(shù)

    這篇文章主要介紹了Python中自定義函方法與參數(shù)具有默認(rèn)值的函數(shù),在Python編程中,可以使用已經(jīng)定義好的函數(shù),也可以自定義函數(shù)實(shí)現(xiàn)某些特殊的功能,更多相關(guān)資料,請(qǐng)需要的人參考下面文章內(nèi)容
    2022-02-02
  • 樹莓派極簡(jiǎn)安裝OpenCv的方法步驟

    樹莓派極簡(jiǎn)安裝OpenCv的方法步驟

    這篇文章主要介紹了樹莓派極簡(jiǎn)安裝OpenCv的方法步驟,文中通過示例代碼介紹的非常詳細(xì),對(duì)大家的學(xué)習(xí)或者工作具有一定的參考學(xué)習(xí)價(jià)值,需要的朋友們下面隨著小編來一起學(xué)習(xí)學(xué)習(xí)吧
    2019-10-10
  • python數(shù)字圖像處理實(shí)現(xiàn)直方圖與均衡化

    python數(shù)字圖像處理實(shí)現(xiàn)直方圖與均衡化

    在圖像處理中,直方圖是非常重要,也是非常有用的一個(gè)處理要素。這篇文章主要介紹了python數(shù)字圖像處理實(shí)現(xiàn)直方圖與均衡化,小編覺得挺不錯(cuò)的,現(xiàn)在分享給大家,也給大家做個(gè)參考。一起跟隨小編過來看看吧
    2018-05-05
  • Python3 集合set入門基礎(chǔ)

    Python3 集合set入門基礎(chǔ)

    集合也也也也是python內(nèi)置的一種數(shù)據(jù)結(jié)構(gòu),它是一個(gè)無序且元素不重復(fù)的序列。這里有兩個(gè)關(guān)鍵詞一個(gè)是無序,這一點(diǎn)和字典是一樣的,另一個(gè)關(guān)鍵詞是元素不重復(fù),這一點(diǎn)和字典的key(鍵)是一樣的
    2020-02-02

最新評(píng)論