Reading an XML file

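To read XML we can use the XmlReader class. XmlReader is a forward-only reader: it moves through the XML file node by node (start tags, text, end tags and so on), not line by line.
Imagine we have the following XML file: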

<?xml version="1.0"?>
<catalog>
  <book id="bk101">
    <author>Gambardella, Matthew</author>
    <title>XML Developer's Guide</title>
    <genre>Computer</genre>
    <price>44.95</price>
    <publish_date>2000-10-01</publish_date>
    <description>
      An in-depth look at creating applications
      with XML.
    </description>
  </book>
  <book id="bk102">
    <author>Ralls, Kim</author>
    <title>Midnight Rain</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-12-16</publish_date>
    <description>
      A former architect battles corporate zombies,
      an evil sorceress, and her own childhood to become queen
      of the world.
    </description>
  </book>
  <book id="bk103">
    <author>Corets, Eva</author>
    <title>Maeve Ascendant</title>
    <genre>Fantasy</genre>
    <price>5.95</price>
    <publish_date>2000-11-17</publish_date>
    <description>
      After the collapse of a nanotechnology
      society in England, the young survivors lay the
      foundation for a new society.
    </description>
  </book>
</catalog>

We can parse it in the following way:

        using System;
        using System.Xml;

        static void Read(string path)
        {
            // 'using' disposes the reader and closes the file when we are done
            using (XmlReader reader = XmlReader.Create(path))
            {
                reader.Read(); // move to the first node
                while (!reader.EOF)
                {
                    if (reader.Name == "catalog") // checks whether the current node is named catalog
                    {
                        if (reader.NodeType == XmlNodeType.Element) // start tag (i.e. <catalog>)
                            Console.WriteLine("Reading catalog started.");
                        if (reader.NodeType == XmlNodeType.EndElement) // end tag (i.e. </catalog>)
                            Console.WriteLine("Reading catalog finished.");
                    }
                    if (reader.Name == "book")
                    {
                        if (reader.NodeType == XmlNodeType.Element)
                        {
                            Console.WriteLine("Reading book started.");
                            if (reader.HasAttributes)
                            {
                                var idAttribute = reader["id"]; // reads an attribute by name (i.e. <book id="bk101">)
                                Console.WriteLine($"Id attribute is: {idAttribute}");
                                var firstAttribute = reader[0]; // reads the first attribute by index (here the same as reader["id"])
                                Console.WriteLine($"First attribute is: {firstAttribute}");
                                var missingAttribute = reader["abc"]; // the abc attribute doesn't exist, so this returns null
                                Console.WriteLine($"Missing attribute is: {missingAttribute}");
                            }
                        }
                        if (reader.NodeType == XmlNodeType.EndElement)
                            Console.WriteLine("Reading book finished.");
                    }

                    // The three methods below consume the whole element and leave the reader on the
                    // node after its end tag, so we 'continue' instead of calling Read() again.
                    // An extra Read() could skip a node when there is no whitespace between elements.
                    if (reader.Name == "title" && reader.NodeType == XmlNodeType.Element)
                    {
                        var titleWithTags = reader.ReadOuterXml(); // reads the title including its tags
                        Console.WriteLine($"Title (with tags) is: {titleWithTags}");
                        continue;
                    }

                    if (reader.Name == "description" && reader.NodeType == XmlNodeType.Element)
                    {
                        var description = reader.ReadInnerXml(); // reads the content of the <description> element
                        Console.WriteLine($"Description is: {description}");
                        continue;
                    }

                    if (reader.Name == "price" && reader.NodeType == XmlNodeType.Element)
                    {
                        var price = reader.ReadElementContentAsDecimal(); // parses the element text as a decimal
                        Console.WriteLine($"Parsed price is: {price}");
                        continue;
                    }

                    reader.Read(); // move to the next node
                }
            }
        }
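
It can help to see exactly which nodes the reader visits. The following is only a small sketch, not part of the example above: the NodeDump class and its Describe method are made-up names, and the XML is passed as a string instead of a file path. It lists every node as "NodeType:Name".

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class NodeDump
{
    // Hypothetical helper: returns every node the reader visits as "NodeType:Name".
    public static string Describe(string xml)
    {
        var parts = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read()) // one step per node, not per line
                parts.Add($"{reader.NodeType}:{reader.Name}");
        }
        return string.Join(" ", parts);
    }

    static void Main()
    {
        // A two-level document with indentation between the tags.
        Console.WriteLine(Describe("<catalog>\n  <book id=\"bk101\" />\n</catalog>"));
    }
}
```

With the default reader settings, the indentation between tags shows up as separate Whitespace nodes with an empty Name. That is why checks on reader.Name simply do not match on those nodes.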

There is also a second option: LINQ to XML. We can parse the same document like this:

        using System;
        using System.Linq;
        using System.Xml.Linq;

        static void ReadByLinq(string path)
        {
            XDocument doc = XDocument.Load(path); // loads and parses the whole document
            var books = doc.Descendants("book").ToList(); // all <book> elements, at any depth
            Console.WriteLine($"There are {books.Count} books:");
            foreach (var b in books)
            {
                // Attribute() and Element() return null when the item is missing; ?. avoids a NullReferenceException
                Console.WriteLine("Id attribute is: " + b.Attribute("id")?.Value);
                Console.WriteLine("Title is: " + b.Element("title")?.Value);
            }
        }

Be aware that the second option loads the whole file into memory, while XmlReader only keeps the current node!
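
If the file is too big to load at once, the two approaches can be combined: XmlReader walks the file, and XNode.ReadFrom turns one <book> at a time into an XElement. The following is a minimal sketch under some assumptions: BookStream and StreamBooks are made-up names, and the helper takes a TextReader so the example can run on an inline string (for a real file, XmlReader.Create(path) could be used instead).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

static class BookStream
{
    // Hypothetical helper: yields <book> elements one by one,
    // so only a single book is held in memory at a time.
    public static IEnumerable<XElement> StreamBooks(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.Read(); // move to the first node
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "book")
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() here.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    static void Main()
    {
        const string xml =
            "<catalog>" +
            "<book id=\"bk101\"><title>XML Developer's Guide</title></book>" +
            "<book id=\"bk102\"><title>Midnight Rain</title></book>" +
            "</catalog>";

        foreach (XElement book in StreamBooks(new StringReader(xml)))
        {
            string id = (string)book.Attribute("id");
            string title = (string)book.Element("title");
            Console.WriteLine($"{id}: {title}");
        }
    }
}
```

Each yielded XElement supports the usual LINQ to XML calls (Attribute, Element, Descendants), but the whole catalog is never held in memory at once.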
