Python XML Processing
XML (eXtensible Markup Language) is a widely used format for storing and transporting structured data.
It is commonly used in:
- Web services (SOAP APIs)
- Configuration files
- Data exchange systems
- Mobile and desktop applications
- Document storage
Python provides built-in modules to easily work with XML data.
What is XML?
XML is a markup language used to define data in a structured format.
Example XML:
<user>
<name>John</name>
<age>30</age>
<city>New York</city>
</user>Why Use XML?
- Platform independent
- Human-readable
- Structured data format
- Widely supported in APIs
- Easy data sharing between systems
Python XML Modules
Python provides two main modules:
- xml.etree.ElementTree (recommended)
- xml.dom.minidom
We will focus on ElementTree.
1. Parsing XML in Python
import xml.etree.ElementTree as ET
xml_data = """
<user>
<name>John</name>
<age>30</age>
<city>New York</city>
</user>
"""
root = ET.fromstring(xml_data)
print(root.tag)Output
userAccessing XML Elements
print(root.find("name").text)
print(root.find("age").text)Output:
John
302. Reading XML from File
Example XML File (data.xml)
<users>
<user>
<name>Alice</name>
<age>25</age>
</user>
</users>Python Code:
import xml.etree.ElementTree as ET
tree = ET.parse("data.xml")
root = tree.getroot()
for user in root.findall("user"):
name = user.find("name").text
age = user.find("age").text
print(name, age)3. Creating XML in Python
import xml.etree.ElementTree as ET
root = ET.Element("users")
user = ET.SubElement(root, "user")
name = ET.SubElement(user, "name")
age = ET.SubElement(user, "age")
name.text = "John"
age.text = "30"
tree = ET.ElementTree(root)
tree.write("output.xml")Output XML
<users>
<user>
<name>John</name>
<age>30</age>
</user>
</users>4. Modifying XML Data
import xml.etree.ElementTree as ET
tree = ET.parse("data.xml")
root = tree.getroot()
for age in root.iter("age"):
age.text = "35"
tree.write("updated.xml")5. Adding New Elements
new_user = ET.SubElement(root, "user")
name = ET.SubElement(new_user, "name")
name.text = "Bob"6. Deleting XML Elements
for user in root.findall("user"):
name = user.find("name").text
if name == "Alice":
root.remove(user)7. Using xml.dom.minidom
For pretty printing XML:
from xml.dom import minidom
xml = "<root><name>John</name></root>"
dom = minidom.parseString(xml)
print(dom.toprettyxml())Output
<root>
<name>John</name>
</root>8. XML Attributes
import xml.etree.ElementTree as ET
root = ET.Element("user", id="101")
name = ET.SubElement(root, "name")
name.text = "John"
print(ET.tostring(root))9. Working with Namespaces
import xml.etree.ElementTree as ET
xml = """<ns:user xmlns:ns="http://example.com">
<ns:name>John</ns:name>
</ns:user>"""
root = ET.fromstring(xml)
print(root.tag)10. XPath in XML
import xml.etree.ElementTree as ET
tree = ET.parse("data.xml")
root = tree.getroot()
users = root.findall(".//user")
for user in users:
print(user.find("name").text)XML vs JSON
| Feature | XML | JSON |
|---|---|---|
| Format | Tag-based | Key-value |
| Readability | Medium | High |
| File size | Larger | Smaller |
| Speed | Slower | Faster |
| Usage | Legacy systems | Modern APIs |
Real-World Applications
XML is used in:
- Web services (SOAP APIs)
- Configuration files
- Microsoft Office files
- Android layouts
- Data exchange systems
- Enterprise applications
Best Practices
- Use ElementTree for simplicity
- Validate XML structure
- Avoid deep nesting
- Use namespaces properly
- Handle parsing errors
Common Errors
Parse Error
xml.etree.ElementTree.ParseErrorSolution:
- Check XML syntax
File Not Found
Ensure XML file path is correct.
Summary
Python provides powerful built-in tools for XML processing using modules like ElementTree and minidom. These tools allow developers to parse, create, and modify XML data efficiently.
Conclusion
XML remains an important data format in many systems. Python makes XML processing simple and efficient, allowing developers to handle structured data in web services, configuration files, and enterprise applications.


0 Comments