
[Python] Reading XML comments and fixing self-closing tags

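Reading XML comments

By default, xml.etree.ElementTree drops comments while parsing. The goal is to keep comments in the following format: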

<!--This is a comment-->

Method 1:

import xml.etree.ElementTree as ET


class CommentedTreeBuilder(ET.TreeBuilder):
    """TreeBuilder that keeps comments instead of discarding them."""

    def comment(self, data):
        # Store the comment as a node whose tag is the ET.Comment factory.
        self.start(ET.Comment, {})
        self.data(data)
        self.end(ET.Comment)


# Use the custom builder as the parser target, then parse with that parser.
parser = ET.XMLParser(target=CommentedTreeBuilder())
tree = ET.parse(filename, parser=parser)
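
On Python 3.8 and later, the standard TreeBuilder can do the same thing through its insert_comments option, so the subclass may not be needed. The sketch below assumes Python 3.8+ and uses a made-up sample string (SAMPLE) rather than a file. It is only an illustration of the idea, not a replacement for the method above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE = "<root><!--This is a comment--><item>1</item></root>"

# insert_comments=True (Python 3.8+) tells the builder to keep comments.
parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
root = ET.fromstring(SAMPLE, parser=parser)

# Comment nodes are the ones whose tag is the ET.Comment factory function.
comments = [node.text for node in root.iter() if node.tag is ET.Comment]

# Serializing the tree writes the comment back out.
serialized = ET.tostring(root, encoding="unicode")
print(comments)
print(serialized)
```

The same check, node.tag is ET.Comment, also works on a tree built with CommentedTreeBuilder, since both approaches store comments under that tag.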

Method 2:

# lxml keeps comments by default, so no custom parser is needed.
import lxml.etree as ET
tree = ET.parse(filename)

Fixing self-closing XML tags

When an element's text is None, the tag is written in self-closing form, for example <br/>.
To avoid this:

for node in tree.iter():
    if node.text is None:
        node.text = ''

With the lxml tree from Method 2, replacing None with an empty string is enough: the element is then written as <br></br>.
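
With the standard-library xml.etree.ElementTree, an empty string still seems to count as "no text" when serializing, so the trick above may not change the output there. A minimal sketch of an alternative, assuming Python 3.4+ and a made-up sample string, is to pass short_empty_elements=False when writing:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
root = ET.fromstring("<p>line one<br/>line two</p>")

# Default serialization collapses the empty element.
default_output = ET.tostring(root, encoding="unicode")

# short_empty_elements=False (Python 3.4+) writes an explicit closing tag.
expanded_output = ET.tostring(
    root, encoding="unicode", short_empty_elements=False
)
print(default_output)
print(expanded_output)
```

ElementTree.write() accepts the same short_empty_elements argument, so the option can also be used when saving a whole tree to a file.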
