python – How to add xml header to dom object

I’m using Python’s xml.dom.minidom but I think the question is valid for any DOM parser.

My original file has a line like this at the beginning:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>

This doesn’t seem to be part of the DOM, so when I do something like dom.toxml() the resulting string does not have that line at the beginning.

How can I add it?

Example output:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Root xmlns:aid="http://xxxxxxxxxxxxxxxxxx">
<Section>BANDSAW BLADES</Section>
</Root>

I hope that is clear.

,

This doesn’t seem to be part of the dom

The XML Declaration doesn’t get a node of its own, no, but the properties declared in it are visible on the Document object:

>>> from xml.dom import minidom
>>> doc = minidom.parseString('<?xml version="1.0" encoding="utf-8" standalone="yes"?><a/>')
>>> doc.encoding
'utf-8'
>>> doc.standalone
True

Serialising the document should include the standalone="yes" part of the declaration, but toxml() doesn’t. You could perhaps consider this a bug, but toxml() makes no promise to serialise the XML declaration faithfully. (For example, you don’t get an encoding in the declaration unless you specifically ask for one.)
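As a rough illustration of that behaviour, the sketch below serialises the same one-element document with and without an explicit encoding. The variable names are only for illustration; the exact whitespace inside the generated declaration may differ between Python versions.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<?xml version="1.0" encoding="utf-8" standalone="yes"?><a/>'
)

# With no arguments, toxml() returns a str whose declaration carries
# only the version: no encoding and no standalone attribute.
plain = doc.toxml()

# With an encoding, toxml() returns bytes and the declaration names
# that encoding, but standalone is still not written.
encoded = doc.toxml(encoding="utf-8")

print(plain)
print(encoded.decode("utf-8"))
```

On Python 3.9 and later, toxml() and toprettyxml() also accept a standalone argument, so doc.toxml(encoding="utf-8", standalone=True) may be the simpler route where that version is available.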

You could take charge of writing the document yourself:

xml = []
# Write the declaration by hand, then serialise each top-level node.
xml.append('<?xml version="1.0" encoding="utf-8" standalone="yes"?>')
for child in doc.childNodes:
    xml.append(child.toxml())
result = '\n'.join(xml)
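The same idea can be wrapped up as a small self-contained script. This is only a sketch: the helper name serialize_with_declaration and the DECLARATION constant are made up for illustration, and the input is the sample document from the question.

```python
from xml.dom import minidom

# Hypothetical constant: the declaration we want at the top of the output.
DECLARATION = '<?xml version="1.0" encoding="utf-8" standalone="yes"?>'


def serialize_with_declaration(doc, declaration=DECLARATION):
    """Return the document as a str, prefixed with a hand-written declaration."""
    parts = [declaration]
    # Serialising each top-level node separately avoids the declaration
    # that Document.toxml() would otherwise emit on its own.
    for child in doc.childNodes:
        parts.append(child.toxml())
    return "\n".join(parts)


source = (
    '<?xml version="1.0" encoding="utf-8" standalone="yes"?>'
    '<Root xmlns:aid="http://xxxxxxxxxxxxxxxxxx">'
    '<Section>BANDSAW BLADES</Section>'
    '</Root>'
)
doc = minidom.parseString(source)
result = serialize_with_declaration(doc)
print(result)
```

Note that the result is a str; if you write it to a file, encode it as UTF-8 so that the bytes agree with the encoding named in the declaration.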

But do you really need the XML declaration here? You are using the default version and encoding, and since you have no DOCTYPE there can be no externally-defined entities, so the document is already standalone by nature. As per the XML standard: “if there are no external markup declarations, the standalone document declaration has no meaning”. It seems to me you could safely omit it completely.
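A quick way to convince yourself of this is to parse the same content with and without the declaration and compare the resulting element trees. This is only a sanity check on a simplified version of the sample document, not a proof, and the variable names are illustrative.

```python
from xml.dom import minidom

body = '<Root><Section>BANDSAW BLADES</Section></Root>'
declaration = '<?xml version="1.0" encoding="utf-8" standalone="yes"?>'

with_decl = minidom.parseString(declaration + body)
without_decl = minidom.parseString(body)

# The declaration only sets properties on the Document object;
# the element tree itself comes out the same either way.
same_tree = (
    with_decl.documentElement.toxml() == without_decl.documentElement.toxml()
)
print(same_tree)
```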
