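A Look at a Problem I Ran into with the RapidXml Library and How I Tracked It Down
There are plenty of open-source XML parsing libraries for C++, and I won't list them all here. Today's topic is RapidXml. I haven't used it all that much, so if anything below is wrong, please point it out. Thanks.
Links:
Official site: http://rapidxml.sourceforge.net/
Official manual: http://rapidxml.sourceforge.net/manual.html
I used the library once before and fell into a "pit", but time was tight and I never looked into it. Today I needed the library again, and since I don't want to fall into the same pit twice, I decided to get to the bottom of it.
Two examples first.
Creating the XML: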
#include <fstream>
#include <iterator>
#include <string>
#include "rapidxml.hpp"
#include "rapidxml_print.hpp"
void CreateXml()
{
    rapidxml::xml_document<> doc;
    auto nodeDecl = doc.allocate_node(rapidxml::node_declaration);
    nodeDecl->append_attribute(doc.allocate_attribute("version", "1.0"));
    nodeDecl->append_attribute(doc.allocate_attribute("encoding", "UTF-8"));
    doc.append_node(nodeDecl); // add the XML declaration
    auto nodeRoot = doc.allocate_node(rapidxml::node_element, "Root"); // create the Root node
    // Add a comment under Root; a comment has no name, so the second argument is null
    nodeRoot->append_node(doc.allocate_node(rapidxml::node_comment, nullptr, "Programming languages"));
    auto nodeLanguage = doc.allocate_node(rapidxml::node_element, "language", "This is C language"); // create a language node
    nodeLanguage->append_attribute(doc.allocate_attribute("name", "C")); // add a name attribute to language
    nodeRoot->append_node(nodeLanguage); // add the language node to Root
    nodeLanguage = doc.allocate_node(rapidxml::node_element, "language", "This is C++ language"); // create another language node
    nodeLanguage->append_attribute(doc.allocate_attribute("name", "C++")); // add a name attribute to language
    nodeRoot->append_node(nodeLanguage); // add the language node to Root
    doc.append_node(nodeRoot); // add the Root node to the document
    std::string buffer;
    rapidxml::print(std::back_inserter(buffer), doc, 0);
    std::ofstream outFile("language.xml");
    outFile << buffer;
    outFile.close();
}
Result:
<?xml version="1.0" encoding="UTF-8"?>
<Root>
	<!--Programming languages-->
	<language name="C">This is C language</language>
	<language name="C++">This is C++ language</language>
</Root>
Modifying the XML:
#include "rapidxml_utils.hpp" // rapidxml::file
void ModifyXml()
{
    rapidxml::file<> requestFile("language.xml"); // load the XML from a file
    rapidxml::xml_document<> doc;
    doc.parse<0>(requestFile.data()); // parse the XML
    auto nodeRoot = doc.first_node(); // get the first node, which is the Root node
    auto nodeLanguage = nodeRoot->first_node("language"); // get the first language node under Root
    nodeLanguage->first_attribute("name")->value("Modify C"); // change the name attribute of language to "Modify C"
    std::string buffer;
    rapidxml::print(std::back_inserter(buffer), doc, 0);
    std::ofstream outFile("ModifyLanguage.xml");
    outFile << buffer;
    outFile.close();
}
Result:
<Root>
	<language name="Modify C">This is C language</language>
	<language name="C++">This is C++ language</language>
</Root>
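The second result shows the following.
The name attribute of the first language element was indeed changed to the value we wanted, but it is hard to miss that the XML declaration and the comment have both disappeared. What happened? This puzzled me for a while. Since the library is open source, let's step through it and see what it does. Judging from the code, there are two suspects, print and parse. Both take a flags argument, and the official tutorial passes 0 to both. So what do these flags actually do? Since print is what runs last, let's start debugging there.
Here is where print is defined: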
template<class OutIt, class Ch>
inline OutIt print(OutIt out, const xml_node<Ch> &node, int flags = 0)
{
return internal::print_node(out, &node, flags, 0);
}
Stepping further:
// Print node
template<class OutIt, class Ch>
inline OutIt print_node(OutIt out, const xml_node<Ch> *node, int flags, int indent)
{
// Print proper node type
switch (node->type())
{
// Document
case node_document:
out = print_children(out, node, flags, indent);
break;
// Element
case node_element:
out = print_element_node(out, node, flags, indent);
break;
// Data
case node_data:
out = print_data_node(out, node, flags, indent);
break;
// CDATA
case node_cdata:
out = print_cdata_node(out, node, flags, indent);
break;
// Declaration
case node_declaration:
out = print_declaration_node(out, node, flags, indent);
break;
// Comment
case node_comment:
out = print_comment_node(out, node, flags, indent);
break;
// Doctype
case node_doctype:
out = print_doctype_node(out, node, flags, indent);
break;
// Pi
case node_pi:
out = print_pi_node(out, node, flags, indent);
break;
// Unknown
default:
assert(0);
break;
}
// If indenting not disabled, add line break after node
if (!(flags & print_no_indenting))
*out = Ch('\n'), ++out;
// Return modified iterator
return out;
}
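Stepping into print_children shows that the traversal is really a recursion: it calls print_node for each child. Next we follow the element case, print_element_node.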
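As a rough model of that recursion, the sketch below prints a tiny tree in the same two-function shape. DemoNode, print_demo_node and print_demo_children are made-up names for illustration; this is not RapidXml's code, and it ignores attributes, values and flags entirely.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature tree; RapidXml's real xml_node is far richer.
struct DemoNode
{
    std::string name;                 // an empty name marks the document root
    std::vector<DemoNode> children;
};

std::string print_demo_node(const DemoNode &node, int indent);

// Counterpart of print_children: hand every child back to the node printer.
std::string print_demo_children(const DemoNode &node, int indent)
{
    std::string out;
    for (const DemoNode &child : node.children)
        out += print_demo_node(child, indent);
    return out;
}

// Counterpart of print_node: the root only forwards to its children,
// any other node prints itself and recurses one level deeper.
std::string print_demo_node(const DemoNode &node, int indent)
{
    if (node.name.empty())
        return print_demo_children(node, indent);
    std::string pad(indent, '\t');
    if (node.children.empty())
        return pad + "<" + node.name + "/>\n";
    return pad + "<" + node.name + ">\n"
         + print_demo_children(node, indent + 1)
         + pad + "</" + node.name + ">\n";
}

// Usage: a document holding Root, which holds one language element.
std::string demo_print_usage()
{
    DemoNode lang;
    lang.name = "language";
    DemoNode root;
    root.name = "Root";
    root.children.push_back(lang);
    DemoNode doc;
    doc.children.push_back(root);
    std::string text = print_demo_node(doc, 0);
    assert(!text.empty());
    return text;
}
```

The point of the model is only that a node which never made it into the tree can never be printed, which is why the printer alone cannot explain the missing declaration and comment.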
// Print element node
template<class OutIt, class Ch>
inline OutIt print_element_node(OutIt out, const xml_node<Ch> *node, int flags, int indent)
{
assert(node->type() == node_element);
// Print element name and attributes, if any
if (!(flags & print_no_indenting))
... // rest of the function omitted
return out;
}
We notice the check if (!(flags & print_no_indenting)) in this function, which tests a single bit with &. Here is the definition of print_no_indenting:
// Printing flags
const int print_no_indenting = 0x1;   //!< Printer flag instructing the printer to suppress indenting of XML. See print() function.
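The way such a bit flag is tested can be tried in isolation. The sketch below is a minimal stand-in, assuming a hypothetical flag demo_no_indenting with the same value as print_no_indenting and a made-up helper finish_line; it imitates the shape of the check at the end of print_node and is not taken from the library.

```cpp
#include <cassert>
#include <string>

// Hypothetical flag with the same value as rapidxml::print_no_indenting.
const int demo_no_indenting = 0x1;

// Returns `text` followed by a line break unless the no-indenting bit is set.
std::string finish_line(const std::string &text, int flags)
{
    std::string out = text;
    if (!(flags & demo_no_indenting))   // bit clear: default behaviour applies
        out += '\n';
    return out;
}

// Usage: 0 keeps the default, the flag switches the line break off.
void demo_flag_usage()
{
    assert(finish_line("<a/>", 0) == "<a/>\n");
    assert(finish_line("<a/>", demo_no_indenting) == "<a/>");
}
```

Because the default is encoded as "bit not set", passing 0 means "all defaults", and every flag names a deviation from the default.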
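That is enough to go on. Assuming the library keeps a consistent style, parse should have flags defined in the same way.
I'll skip the walk through parse itself.
I also checked the official documentation, and it is indeed as I expected. Below are the descriptions of these flags from the header file; see the official documentation for details.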
// Parsing flags
//! Parse flag instructing the parser to not create data nodes.
//! Text of first data node will still be placed in value of parent element, unless rapidxml::parse_no_element_values flag is also specified.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_data_nodes = 0x1;
//! Parse flag instructing the parser to not use text of first data node as a value of parent element.
//! Can be combined with other flags by use of | operator.
//! Note that child data nodes of element node take precendence over its value when printing.
//! That is, if element has one or more child data nodes <em>and</em> a value, the value will be ignored.
//! Use rapidxml::parse_no_data_nodes flag to prevent creation of data nodes if you want to manipulate data using values of elements.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_element_values = 0x2;
//! Parse flag instructing the parser to not place zero terminators after strings in the source text.
//! By default zero terminators are placed, modifying source text.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_string_terminators = 0x4;
//! Parse flag instructing the parser to not translate entities in the source text.
//! By default entities are translated, modifying source text.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_entity_translation = 0x8;
//! Parse flag instructing the parser to disable UTF-8 handling and assume plain 8 bit characters.
//! By default, UTF-8 handling is enabled.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_no_utf8 = 0x10;
//! Parse flag instructing the parser to create XML declaration node.
//! By default, declaration node is not created.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_declaration_node = 0x20;
//! Parse flag instructing the parser to create comments nodes.
//! By default, comment nodes are not created.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_comment_nodes = 0x40;
//! Parse flag instructing the parser to create DOCTYPE node.
//! By default, doctype node is not created.
//! Although W3C specification allows at most one DOCTYPE node, RapidXml will silently accept documents with more than one.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_doctype_node = 0x80;
//! Parse flag instructing the parser to create PI nodes.
//! By default, PI nodes are not created.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_pi_nodes = 0x100;
//! Parse flag instructing the parser to validate closing tag names.
//! If not set, name inside closing tag is irrelevant to the parser.
//! By default, closing tags are not validated.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_validate_closing_tags = 0x200;
//! Parse flag instructing the parser to trim all leading and trailing whitespace of data nodes.
//! By default, whitespace is not trimmed.
//! This flag does not cause the parser to modify source text.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_trim_whitespace = 0x400;
//! Parse flag instructing the parser to condense all whitespace runs of data nodes to a single space character.
//! Trimming of leading and trailing whitespace of data is controlled by rapidxml::parse_trim_whitespace flag.
//! By default, whitespace is not normalized.
//! If this flag is specified, source text will be modified.
//! Can be combined with other flags by use of | operator.
//! <br><br>
//! See xml_document::parse() function.
const int parse_normalize_whitespace = 0x800;
// Compound flags
//! Parse flags which represent default behaviour of the parser.
//! This is always equal to 0, so that all other flags can be simply ored together.
//! Normally there is no need to inconveniently disable flags by anding with their negated (~) values.
//! This also means that meaning of each flag is a <i>negation</i> of the default setting.
//! For example, if flag name is rapidxml::parse_no_utf8, it means that utf-8 is <i>enabled</i> by default,
//! and using the flag will disable it.
//! <br><br>
//! See xml_document::parse() function.
const int parse_default = 0;
//! A combination of parse flags that forbids any modifications of the source text.
//! This also results in faster parsing. However, note that the following will occur:
//! <ul>
//! <li>names and values of nodes will not be zero terminated, you have to use xml_base::name_size() and xml_base::value_size() functions to determine where name and value ends</li>
//! <li>entities will not be translated</li>
//! <li>whitespace will not be normalized</li>
//! </ul>
//! See xml_document::parse() function.
const int parse_non_destructive = parse_no_string_terminators | parse_no_entity_translation;
//! A combination of parse flags resulting in fastest possible parsing, without sacrificing important data.
//! <br><br>
//! See xml_document::parse() function.
const int parse_fastest = parse_non_destructive | parse_no_data_nodes;
//! A combination of parse flags resulting in largest amount of data being extracted.
//! This usually results in slowest parsing.
//! <br><br>
//! See xml_document::parse() function.
const int parse_full = parse_declaration_node | parse_comment_nodes | parse_doctype_node | parse_pi_nodes | parse_validate_closing_tags;
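To make the effect of parse_declaration_node and parse_comment_nodes concrete, here is a toy model of a parser that takes its flags as a template argument, the way parse<Flags> does. Everything in it (DemoToken, demo_parse, the demo_ flags) is hypothetical and only mirrors the documented behaviour: declarations and comments are dropped unless the matching bit is set.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical flags with the same values as their RapidXml counterparts.
const int demo_declaration_node = 0x20;
const int demo_comment_nodes    = 0x40;

enum DemoKind { demo_declaration, demo_comment, demo_element };

struct DemoToken
{
    DemoKind kind;
    std::string text;
};

// Toy "parser": elements are always kept, declarations and comments
// only when the matching bit is present in the compile-time Flags value.
template<int Flags>
std::vector<DemoToken> demo_parse(const std::vector<DemoToken> &input)
{
    std::vector<DemoToken> kept;
    for (const DemoToken &tok : input)
    {
        if (tok.kind == demo_declaration && !(Flags & demo_declaration_node))
            continue;   // dropped, as with parse<0>
        if (tok.kind == demo_comment && !(Flags & demo_comment_nodes))
            continue;   // dropped, as with parse<0>
        kept.push_back(tok);
    }
    return kept;
}

// Builds the three top-level items of the example document.
std::vector<DemoToken> demo_input()
{
    DemoToken decl = { demo_declaration, "xml" };
    DemoToken note = { demo_comment, "Programming languages" };
    DemoToken root = { demo_element, "Root" };
    std::vector<DemoToken> in;
    in.push_back(decl);
    in.push_back(note);
    in.push_back(root);
    assert(in.size() == 3);
    return in;
}
```

With demo_parse<0> only the element survives, which matches the output of the second example: the printer had nothing else to print.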
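Based on the information above, we change the earlier source code.
Change
doc.parse<0>(requestFile.data()); // parse the XML
auto nodeRoot = doc.first_node(); // get the first node, which is the Root node
to
doc.parse<rapidxml::parse_declaration_node | rapidxml::parse_comment_nodes | rapidxml::parse_non_destructive>(requestFile.data()); // parse the XML
auto nodeRoot = doc.first_node("Root"); // get the Root node by name
To explain: parse now gets three flags. They tell the parser to create the declaration node, to create comment nodes, and not to modify the data passed in. Only the first two are needed to keep the declaration and the comment. As the header comment for parse_non_destructive says, that flag leaves names and values without zero terminators, so name_size() and value_size() have to be used when reading them. The second line changes because, once the XML declaration is kept as a node, the default first_node() no longer returns the Root node we expect, so we pass the node name to find the one we want.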
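Why first_node() stops returning Root can be pictured with a small model. The sketch below assumes a flat list of top-level nodes and a made-up lookup function demo_first_node; it is not the library's implementation, only an illustration of "first node of any kind" versus "first node with this name".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum DemoType { demo_type_declaration, demo_type_element };

struct DemoTopNode
{
    DemoType type;
    std::string name;   // left empty for the declaration in this model
};

// Returns the index of the first matching node, or -1 if there is none.
// A null `name` matches any node, like calling first_node() with no argument.
int demo_first_node(const std::vector<DemoTopNode> &nodes, const char *name = nullptr)
{
    for (std::size_t i = 0; i < nodes.size(); ++i)
        if (name == nullptr || nodes[i].name == name)
            return static_cast<int>(i);
    return -1;
}

// Document as seen once the declaration is kept: declaration first, then Root.
std::vector<DemoTopNode> demo_document_with_declaration()
{
    DemoTopNode decl = { demo_type_declaration, "" };
    DemoTopNode root = { demo_type_element, "Root" };
    std::vector<DemoTopNode> doc;
    doc.push_back(decl);
    doc.push_back(root);
    assert(doc.size() == 2);
    return doc;
}
```

In this model the unnamed lookup lands on index 0, the declaration, while the lookup by name lands on Root regardless of what precedes it.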
Notes:
1. When appending, the library does not check whether the item (node, attribute, and so on) already exists.
2. Modifying items (nodes, attributes, and so on) while looping over them invalidates the iteration.
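Both notes can be worked around in calling code. The sketch below shows the two habits on a std::list used as a stand-in for an attribute list; DemoAttr, demo_set_attribute and demo_remove_all are hypothetical names, and the same ideas would have to be adapted to RapidXml's own node and attribute pointers.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

typedef std::pair<std::string, std::string> DemoAttr;   // name, value
typedef std::list<DemoAttr> DemoAttrList;

// Note 1: look the name up first, so a second call updates instead of duplicating.
void demo_set_attribute(DemoAttrList &attrs, const std::string &name, const std::string &value)
{
    for (DemoAttr &attr : attrs)
    {
        if (attr.first == name)
        {
            attr.second = value;
            return;
        }
    }
    attrs.push_back(DemoAttr(name, value));
}

// Note 2: obtain the successor before the current item goes away.
// std::list::erase returns the iterator following the erased element.
void demo_remove_all(DemoAttrList &attrs, const std::string &name)
{
    for (DemoAttrList::iterator it = attrs.begin(); it != attrs.end(); )
    {
        if (it->first == name)
            it = attrs.erase(it);
        else
            ++it;
    }
}

// Usage: setting the same name twice leaves a single entry.
DemoAttrList demo_attr_usage()
{
    DemoAttrList attrs;
    demo_set_attribute(attrs, "name", "C");
    demo_set_attribute(attrs, "name", "C++");
    assert(attrs.size() == 1);
    return attrs;
}
```

With a pointer-linked structure the equivalent of the second habit is to read the next sibling into a local variable before removing or re-parenting the current node.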
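Summary: when you use a library someone else wrote, you will always run into a few surprises. These are the only problems I have hit so far; if you know of others, please add them. Also, a "pit" does not necessarily mean the open-source library is at fault. More often it means we have not yet learned to use the tool properly.
Thanks to the author of RapidXml for giving us such an efficient and convenient tool.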