Data Encoding Formats: XML vs JSON vs YAML

What You’ll Learn

Overview of data encoding formats
XML vs. JSON vs. YAML
Demo JSON vs. YAML

What You’ll Need

A basic background in a coding language such as Python and a basic understanding of APIs

Network engineers are used to seeing configuration formatted with indentation, such as this snippet for an interface:

cisco#show run interface GigabitEthernet 1
Building configuration...

Current configuration : 146 bytes
!
interface GigabitEthernet1
 vrf forwarding MANAGEMENT
 ip address 10.0.0.151 255.255.255.0
 negotiation auto
 no mop enabled
 no mop sysid
end

The problem with the configuration above is that it is difficult for a programming language to parse the data natively and understand the relationships between its components. The format is inherently ambiguous, and a human reader fills in the gaps through knowledge of networking protocols and of how IOS displays configuration. The limitations of formatted data that has no programmatic structure defining the relationships between objects become much clearer at scale, for example when you loop over more than 100 devices, each with dozens of interfaces and multiple IP addresses.
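To make the difficulty concrete, here is a minimal Python sketch that pulls the IP address out of the raw text above with a regular expression. The pattern and the variable names are illustrative assumptions, not part of any Cisco tooling. The point is that the parsing logic has to hard-code knowledge of how IOS prints the line.

```python
import re

# Raw CLI output as a script would receive it: one flat string with no structure.
RAW_CONFIG = """\
interface GigabitEthernet1
 vrf forwarding MANAGEMENT
 ip address 10.0.0.151 255.255.255.0
 negotiation auto
 no mop enabled
 no mop sysid
end
"""

# Hypothetical pattern: it only understands the exact form " ip address <addr> <mask>".
IP_LINE = re.compile(r"^ ip address (\S+) (\S+)$", re.MULTILINE)

match = IP_LINE.search(RAW_CONFIG)
address, mask = match.groups() if match else (None, None)
print(address, mask)
```

A line with a trailing keyword, such as a secondary address, would not match this pattern and would need its own rule. That is the kind of gap a structured format removes.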

Structured data is data encoded in such a way that a programming language can load it and understand the key/value pairs, the lists, and so on. Just as in networking, where data needs to be encoded to be sent over the wire and particular bits have significance, programming relies on data encoding formats.

The three most common data encoding formats are XML, JSON, and YAML. These data encoding formats do not change the data fundamentally; they merely provide formatting constraints to give coding languages the structure needed to load in the data to then be manipulated or processed.

There are pros and cons for each of them. Usually you are given data by an API on a network device, such as NETCONF or RESTCONF, or by an application such as a network controller like Cisco DNA Center or Cisco Network Services Orchestrator (NSO). When you make the API call, you are often given the option to ask for the data in either XML or JSON. You are almost never writing XML or JSON from scratch; rather, you are given a data payload as the output of some other interaction with an application or piece of hardware.

XML

XML is the oldest of the three options, but it has the most features for advanced data processing. XML uses open and close tags to indicate where data starts and ends, such as <data>payload</data>. Converting our original example of an interface into XML goes from this:

interface GigabitEthernet1
 vrf forwarding MANAGEMENT
 ip address 10.0.0.151 255.255.255.0
 negotiation auto
 no mop enabled
 no mop sysid
end

To this:

<GigabitEthernet>
  <name>1</name>
  <vrf>
    <forwarding>MANAGEMENT</forwarding>
  </vrf>
  <ip>
    <address>
      <primary>
        <address>10.0.0.151</address>
        <mask>255.255.255.0</mask>
      </primary>
    </address>
  </ip>
  <mop>
    <enabled>false</enabled>
    <sysid>false</sysid>
  </mop>
</GigabitEthernet>

The indentation in XML is merely for readability; the opening and closing tags are the most important parts (closing tags have the / in them):

<GigabitEthernet>
...
</GigabitEthernet>

There are other features of XML, such as namespaces, XPath, and attributes, that are outside the scope of this simple tutorial.

One of the benefits of XML is that its features allow the designers of a payload to include metadata about the schema and other attributes of the data, which is not possible in JSON or YAML.

The main building block of XML is called an XML “element,” which is the opening and closing tags and any data between the tags:

<address>10.0.0.151</address>

Some examples of applications that use XML:

NETCONF (used by Cisco IOS XE, IOS XR, etc.)
Cisco DNA Center
Cisco NSO
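
To see how a program consumes the XML elements described in this section, the following minimal Python sketch loads the interface payload from above with the standard library's xml.etree.ElementTree and reads a few values by path. The variable names are illustrative, and the payload here carries no namespaces. Real NETCONF replies typically do, in which case the tag names in the paths would need to be namespace-qualified.

```python
import xml.etree.ElementTree as ET

# The interface payload from this section, held as a plain string.
XML_PAYLOAD = """
<GigabitEthernet>
  <name>1</name>
  <vrf>
    <forwarding>MANAGEMENT</forwarding>
  </vrf>
  <ip>
    <address>
      <primary>
        <address>10.0.0.151</address>
        <mask>255.255.255.0</mask>
      </primary>
    </address>
  </ip>
  <mop>
    <enabled>false</enabled>
    <sysid>false</sysid>
  </mop>
</GigabitEthernet>
"""

# Parse the text into a tree of elements; root is the outermost element.
root = ET.fromstring(XML_PAYLOAD.strip())

# Each path names the child elements to walk through, separated by "/".
name = root.findtext("name")
address = root.findtext("ip/address/primary/address")
mop_enabled = root.findtext("mop/enabled")

print(root.tag, name, address, mop_enabled)
```

Note that element text always comes back as a string, so mop_enabled holds the text "false" rather than a Boolean. Converting it is left to the caller.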

JSON

JSON is very common in network automation. It lacks some of the advanced features of XML, but it is more human readable and less verbose: instead of opening and closing tags, it uses a series of {}, [], and "" (for keys and strings) to format the data. For engineers who have worked with JavaScript or Python, the syntax closely resembles Python lists and dictionaries.

Here is a sample snippet of the same interface config, but now in JSON:

{
  "Cisco-IOS-XE-native:GigabitEthernet": {
      "name": "1",
      "vrf": {
          "forwarding": "MANAGEMENT"
      },
      "ip": {
          "address": {
              "primary": {
                  "address": "10.0.0.151",
                  "mask": "255.255.255.0"
              }
          }
      },
      "mop": {
          "enabled": false,
          "sysid": false
      }
  }
}

JSON indentation is also just for readability. The main thing to notice is the "key": "value" pairs, which are separated by commas. This particular example does not contain a list, but a list uses the [] characters instead of the {} used in this example for dictionary items. A JSON list looks like this:

{
  "list_name": [
    "first-item",
    "second item"
  ]
}
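
The similarity to Python dictionaries and lists can be seen by loading both snippets with the standard library's json module. This is a minimal sketch, and the variable names are illustrative.

```python
import json

# The interface payload from above, held as a plain string.
JSON_PAYLOAD = """
{
  "Cisco-IOS-XE-native:GigabitEthernet": {
    "name": "1",
    "vrf": {"forwarding": "MANAGEMENT"},
    "ip": {
      "address": {
        "primary": {"address": "10.0.0.151", "mask": "255.255.255.0"}
      }
    },
    "mop": {"enabled": false, "sysid": false}
  }
}
"""

# JSON objects ({}) become Python dicts; JSON false becomes Python False.
data = json.loads(JSON_PAYLOAD)
interface = data["Cisco-IOS-XE-native:GigabitEthernet"]
primary = interface["ip"]["address"]["primary"]
print(primary["address"], primary["mask"])
print(interface["mop"]["enabled"])

# JSON arrays ([]) become Python lists, indexed by position.
items = json.loads('{"list_name": ["first-item", "second item"]}')["list_name"]
print(items[0], len(items))
```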

Some examples of applications that use JSON:

RESTCONF (used by Cisco IOS XE, IOS XR, etc.)
Most application REST APIs
Red Hat Ansible

YAML

YAML is even more common in network automation because it is very human readable. Unlike JSON and XML, it is very common for engineers to write their own YAML files or update existing ones manually. When YAML is not being written by hand for a configuration file or a playbook, it is often converted from JSON output to make that output more readable and easier to edit if needed.

The same interface configuration in YAML looks like this:

---
Cisco-IOS-XE-native:GigabitEthernet:
  name: '1'
  vrf:
    forwarding: MANAGEMENT
  ip:
    address:
      primary:
        address: 10.0.0.151
        mask: 255.255.255.0
  mop:
    enabled: false
    sysid: false

The three hyphens (---) at the top of the snippet mark the start of a new YAML document. Indentation is very important in YAML because it is used to indicate the hierarchy of the relationship between objects, such as:

  ip:
    address:
      primary:
        address: 10.0.0.151

This says that four dictionary key lookups, through three layers of nested objects, are needed to extract the value of the address: ip -> address -> primary -> address. Putting quotation marks around strings is optional, though highly recommended, because special characters in a string can break parsing engines.
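
The chain of lookups can be sketched in Python. The dictionary below is hand-written to mirror what a YAML parser (for example, the third-party PyYAML package's yaml.safe_load) would be expected to return for the snippet above. The lookup helper is a hypothetical name used only for illustration.

```python
from functools import reduce
from operator import getitem

# Hand-written stand-in for the parsed YAML snippet: one dict per indentation level.
loaded = {"ip": {"address": {"primary": {"address": "10.0.0.151"}}}}

# One key lookup per step of ip -> address -> primary -> address.
ip_block = loaded["ip"]
address_block = ip_block["address"]
primary_block = address_block["primary"]
value = primary_block["address"]


def lookup(tree, path):
    """Hypothetical helper: follow a list of keys down through nested dicts."""
    return reduce(getitem, path, tree)


# The same walk expressed as a single path of keys.
same_value = lookup(loaded, ["ip", "address", "primary", "address"])
print(value, same_value)
```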

A YAML list uses hyphens to denote list items:

---
list_name:
- first-item
- second item

Some examples of applications that use YAML:

Red Hat Ansible
Kubernetes
GitLab CI

A helpful tip for network engineers getting started is to experiment with the JSON and YAML data you encounter as you learn automation. The easiest way to do that is with a visualization tool that shows both formats side by side, such as the free website JSON2YAML. You can take your RESTCONF JSON output or your Ansible YAML variables and compare the two views. This is especially helpful when you have system-generated output that is heavily nested JSON and you want a bird’s eye view of the structure of the data by seeing it in YAML.

For practice, take the following Nexus VLAN JSON (parsed by an NSO NED into JSON) and add another VLAN, ID 106, with a name of “lab”:

{
    "tailf-ned-cisco-nx:vlan": {
        "vlan-list": [
            { "id": 1 },
            {
                "id": 101,
                "name": "prod"
            },
            {
                "id": 102,
                "name": "dev"
            },
            {
                "id": 103,
                "name": "test"
            },
            {
                "id": 104,
                "name": "security"
            },
            {
                "id": 105,
                "name": "iot"
            },
            {
                "id": 1337,
                "name": "SVI-DEMO"
            }
        ]
    }
}

Is it easier to work with when you can see the YAML version next to it? Also, try adding a list item on the YAML side with a VLAN ID and name of your choice.
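
Once you have tried the edit by hand, the same change can be made programmatically. The following is a minimal Python sketch using the standard library's json module. The variable names are illustrative, and sorting the list by ID is an optional choice made here for readability, not something the payload requires.

```python
import json

# The Nexus VLAN payload from above, written compactly.
VLAN_JSON = """
{
  "tailf-ned-cisco-nx:vlan": {
    "vlan-list": [
      {"id": 1},
      {"id": 101, "name": "prod"},
      {"id": 102, "name": "dev"},
      {"id": 103, "name": "test"},
      {"id": 104, "name": "security"},
      {"id": 105, "name": "iot"},
      {"id": 1337, "name": "SVI-DEMO"}
    ]
  }
}
"""

data = json.loads(VLAN_JSON)

# "vlan-list" is a JSON array, so it loads as a Python list of dicts.
vlans = data["tailf-ned-cisco-nx:vlan"]["vlan-list"]

# Adding a VLAN is just appending one more dict to that list.
vlans.append({"id": 106, "name": "lab"})

# Optional: keep the entries ordered by VLAN ID.
vlans.sort(key=lambda vlan: vlan["id"])

# Serialize back to JSON text with the same 4-space indentation style.
print(json.dumps(data, indent=4))
```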
