XML Format

MIME type: application/xml

ID: xml

The XML data structure is mapped to DataWeave objects that can contain other objects, strings, or null values. XML uses unbounded elements to represent collections, which are mapped to repeated keys in DataWeave objects. In addition, DataWeave natively supports namespaces, CData, and xsi:types.
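
As a minimal sketch of how repeated elements surface as repeated keys, the following script parses a small inline document with the read function and selects the repeated item element. The order and item element names are hypothetical and chosen only for illustration.

```dataweave
%dw 2.0
output application/json
// Hypothetical inline document; any XML with a repeated element works the same way.
var doc = read("<order><item>pen</item><item>ink</item></order>", "application/xml")
---
{
  // The multi-value selector (*) collects every value stored under the repeated key.
  items: doc.order.*item,
  // The single-value selector returns only the first match.
  firstItem: doc.order.item
}
```

The multi-value selector should return both item values as an array, while the single-value selector returns only the first one.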

The DataWeave reader for XML input supports the following parsing strategies:

  • Indexed

  • In-Memory

  • Streaming

To understand the parsing strategies that DataWeave readers and writers can apply to this format, see DataWeave Parsing Strategies.

CData Custom Type

CData is a custom DataWeave data type for XML that identifies a character data (CDATA) block. Use the CData type to make the XML writer wrap content inside a CDATA block, or to check whether an input string is inside a CDATA block. In DataWeave, CData inherits from the type String.

DocType Custom Type

DocType is a custom DataWeave data type for XML that is used to identify a doctype declaration (DTD).

Read and Write DTDs

Introduced in DataWeave 2.5.0. Supported by Mule 4.5.0 and later.

DataWeave can read and write doctype declarations in XML files.

To read doctype declarations, set the system property com.mulesoft.dw.xml_reader.parseDtd.

During the reading phase, DataWeave parses the doctype declaration and stores the content as metadata of the root element. The DocType value is stored in the docType variable. You can use the metadata selector (^) to extract the value of that variable with an expression such as payload.^docType.

NOTE

DTDs are disabled by default even if their declarations are read or written. For details, see XML reader property supportDtd in Reader Properties.

Examples

The following examples show uses of the XML format.

Refer to DataWeave Formats for more detail on available reader and writer properties for various data formats.

Example: Stream Input XML Data

This example shows how to set up XML streaming, which requires you to specify the following reader properties:

  • streaming=true

  • collectionPath="root.repeated"

The collectionPath setting selects the elements to stream.

When streaming, the XML parser can start processing content without having all the XML content.

Input

The following XML serves as the input payload to the DataWeave source. Assume that it is the content of an XML file myXML.xml.

myXML.xml
<root>
    <text>
        Text
    </text>
    <repeated>
        <user>
            <name>Mariano</name>
            <lastName>de Achaval</lastName>
            <age>36</age>
        </user>
        <user>
            <lastName>Shokida</lastName>
            <name>Leandro</name>
            <age>30</age>
        </user>
        <user>
            <age>29</age>
            <name>Ana</name>
            <lastName>Felissati</lastName>

        </user>
        <user>
            <age>29</age>
            <lastName>Chibana</lastName>
            <name>Christian</name>
        </user>

    </repeated>
</root>

Source

The reader property settings in the DataWeave script tell the XML reader to stream the input and process the repeated keys. The script uses the DataWeave map function to iterate over the repeated keys.

%dw 2.0
var myInput = readUrl('classpath://myXML.xml', 'application/xml', {streaming:true, collectionPath: "root.repeated"})
output application/dw
---
myInput.root.repeated.*user map {
    n: $.name,
    l: $.lastName,
    a: $.age
}

Output

The script transforms the mapped input XML to the DataWeave (dw) format and MIME type.

[
  {
    "n": "Mariano",
    "l": "de Achaval",
    "a": "36"
  },
  {
    "n": "Leandro",
    "l": "Shokida",
    "a": "30"
  },
  {
    "n": "Ana",
    "l": "Felissati",
    "a": "29"
  },
  {
    "n": "Christian",
    "l": "Chibana",
    "a": "29"
  }
]

Example: Null or Empty String in XML

Because there is no standard way to represent a null value in XML, the reader maps the value to null when the nil attribute is set to true.

Input

The following XML serves as the input payload to the DataWeave source. Notice the nil setting in <author xsi:nil="true"/>.

<book xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <author xsi:nil="true"/>
</book>

Source

The DataWeave script transforms the input XML to the JSON format and MIME type.

%dw 2.0
output application/json
---
payload

Output

The output is in the JSON format. Notice that the nil value in the input is transformed to null.

{
  "book": {
    "author": null
  }
}
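
The reverse direction can be sketched with the writeNilOnNull writer property, which is listed under Writer Properties. This is an illustrative script, not part of the original example: assuming the property behaves as described there, the null value of author should be written with an xsi:nil="true" attribute rather than as an empty element.

```dataweave
%dw 2.0
// writeNilOnNull asks the XML writer to mark null values with the nil attribute.
output application/xml writeNilOnNull=true
---
{
  book: {
    author: null
  }
}
```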

Example: Output null Values for Missing XML Values

The XML reader property nullValueOn accepts the values blank (the default), empty, or none.

This example uses the nullValueOn default, so it maps the values of the title and author elements to null.

Input

The following XML serves as input to the DataWeave source. Notice that the title and author elements lack values.

Assume that this input is the content of the file myXML.xml.

myXML.xml content:
<book>
    <author></author>
    <title>

</title>
</book>

Source

The DataWeave script transforms the XML input to JSON. Note that it explicitly sets the nullValueOn default, blank, for demonstration purposes.

%dw 2.0
var myInput = readUrl('classpath://myXML.xml', 'application/xml', {nullValueOn: "blank"})
output application/json
---
myInput

Output

The title and author keys in the JSON output are assigned null values.

{
  "book": {
    "author": null,
    "title": null
  }
}

Example: Output null Values for Missing XML Values

The XML reader property nullValueOn accepts the values blank (the default), empty, or none.

This example maps the value of the title element to a String and the value of the author element to null because the XML reader property nullValueOn is set to empty.

Input

The following XML serves as input to the DataWeave source. Notice that the title and author elements lack values. The difference between the two elements is the space between the starting and ending tags. The tags of the title element are separated by line breaks (the hidden characters, \n), while the tags of the author element are not separated by any characters.

Assume that this input is the content of the file myXML.xml.

myXML.xml content:
<book>
    <author></author>
    <title>

</title>
</book>

Source

The DataWeave script uses a DataWeave variable to input the content of myXML.xml and applies the nullValueOn: "empty" to it. The script transforms the XML input to JSON.

%dw 2.0
var myInput = readUrl('classpath://myXML.xml', 'application/xml', {nullValueOn: "empty"})
output application/json
---
myInput

Output

The JSON output maps the value of the author element to null and the value of the title element to the String value "\n\n", which represents the two newline characters between the title tags.

{
  "book": {
    "author": null,
    "title": "\n\n"
  }
}

Example: Represent XML Attributes in the DataWeave (dw) Format

This example maps XML attributes to a canonical DataWeave representation, the application/dw format and MIME type.

Input

The XML serves as the input payload to the DataWeave source. Notice that the input contains XML attributes.

<users>
  <company>MuleSoft</company>
  <user name="Leandro" lastName="Shokida"/>
  <user name="Mariano" lastName="Achaval"/>
</users>

Source

The DataWeave script transforms the XML input payload to the DataWeave (dw) format and MIME type.

%dw 2.0
output application/dw
---
payload

Output

The output shows how the DataWeave (dw) format represents the XML input. Notice how the attributes from the XML input and the empty values are represented.

{
  users: {
    company: "MuleSoft",
    user @(name: "Leandro",lastName: "Shokida"): "",
    user @(name: "Mariano",lastName: "Achaval"): ""
  }
}
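
To read the attribute values rather than just display them, a script can use the attribute selector (@). The following is a minimal sketch that assumes the same input payload as this example. The user parameter name is arbitrary.

```dataweave
%dw 2.0
output application/json
---
// Iterate over every user element and read its attributes with the @ selector.
payload.users.*user map ((user) -> {
    name: user.@name,
    lastName: user.@lastName
})
```

Each user element should map to an object that holds its name and lastName attribute values.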

Example: Represent XML Namespaces in the DataWeave (dw) Format

This example maps XML namespaces to a canonical DataWeave representation, the application/dw format and MIME type.

Input

The XML serves as the input payload to the DataWeave source. Notice that the input contains XML namespaces.

<root>
    <h:table xmlns:h="http://www.w3.org/TR/html4/">
      <h:tr>
        <h:td>Apples</h:td>
        <h:td>Bananas</h:td>
      </h:tr>
    </h:table>

    <f:table xmlns:f="https://www.w3schools.com/furniture">
      <f:name>African Coffee Table</f:name>
      <f:width>80</f:width>
      <f:length>120</f:length>
    </f:table>
</root>

Source

The DataWeave script transforms the XML input payload to the DataWeave (dw) format and MIME type.

%dw 2.0
output application/dw
---
payload

Output

The output shows how the DataWeave (dw) format represents the XML input. Notice how the namespaces from the XML are represented.

ns h http://www.w3.org/TR/html4/
ns f https://www.w3schools.com/furniture
---
{
  root: {
      h#table: {
        h#tr: {
          h#td: "Apples",
          h#td: "Bananas"
        }
      },
      f#table: {
        f#name: "African Coffee Table",
        f#width: "80",
        f#length: "120"
      }
  }
}
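
Selecting namespaced elements requires declaring the namespaces in the script header and prefixing each key with the namespace, as in the dw output above. The following sketch assumes the same input payload as this example. The cells and furniture key names are hypothetical.

```dataweave
%dw 2.0
output application/json
// The prefixes are local to the script; the URIs must match the ones in the input.
ns h http://www.w3.org/TR/html4/
ns f https://www.w3schools.com/furniture
---
{
  // Collect every h:td value under the HTML table.
  cells: payload.root.h#table.h#tr.*h#td,
  // Read a single element from the furniture table.
  furniture: payload.root.f#table.f#name
}
```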

Example: Create a CDATA Element

This example shows how to use the CData type to create a CDATA element in the XML output.

Source

The body of the DataWeave script coerces the String value to the CData type.

%dw 2.0
output application/xml
---
{
    test: "A text <a>" as CData
}

Output

The output encloses the input String value in a CDATA block, which contains the special characters, < and >, from the input.

<?xml version='1.0' encoding='UTF-8'?>
<test><![CDATA[A text <a>]]></test>

Example: Check for CDATA in a String

This example indicates whether a given String value is CDATA.

Input

The XML serves as the input payload to the DataWeave source. Notice that the test element contains a CDATA block.

<?xml version='1.0' encoding='UTF-8'?>
<test><![CDATA[A text <a>]]></test>

Source

The DataWeave script uses the is CData expression to determine whether the String value is CDATA.

%dw 2.0
output application/json
---
{
    test: payload.test is CData
}

Output

The JSON output contains the value true, which indicates that the input String value is CDATA.

{
    "test": true
}

Example: Use the inlineCloseOn Writer Property

This example uses the inlineCloseOn writer property with the value none to act on the key-value pairs from the input.

Source

The DataWeave script transforms the body content to XML. Notice that values of the emptyElement keys are null.

%dw 2.0
output application/xml inlineCloseOn="none"
---
{
  someXml: {
    parentElement: {
      emptyElement1: null,
      emptyElement2: null,
      emptyElement3: null
    }
  }
}

Output

The emptyElement elements are empty. They do not contain the value null.

<?xml version='1.0' encoding='UTF-8'?>
<someXml>
  <parentElement>
    <emptyElement1></emptyElement1>
    <emptyElement2></emptyElement2>
    <emptyElement3></emptyElement3>
  </parentElement>
</someXml>
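
For comparison, here is a sketch of a similar script with inlineCloseOn set to empty, which the Writer Properties table lists as the default. Assuming the property behaves as described there, the null-valued elements should be written with inline close tags, such as <emptyElement1/>, instead of separate open and close tags.

```dataweave
%dw 2.0
// "empty" is the documented default; it is set explicitly here for demonstration.
output application/xml inlineCloseOn="empty"
---
{
  someXml: {
    parentElement: {
      emptyElement1: null,
      emptyElement2: null
    }
  }
}
```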

Example: Transform Repeated JSON Keys to Repeated XML Elements

XML encodes collections using repeated (unbounded) elements. DataWeave represents unbounded elements by repeating the same key.

This example shows how to convert the repeated keys in a JSON array of objects into repeated XML elements.

Input

The JSON input serves as the payload to the DataWeave source. Notice that the name keys in the array are repeated.

{
  "friends": [
    {"name": "Mariano"},
    {"name": "Shoki"},
    {"name": "Tomo"},
    {"name": "Ana"}
  ]
}

Source

The DataWeave script selects the value of the friends key.

%dw 2.0
output application/xml
---
friends: {
    (payload.friends)
}

Output

The output represents the name keys as XML elements.

<?xml version='1.0' encoding='UTF-8'?>
<friends>
  <name>Mariano</name>
  <name>Shoki</name>
  <name>Tomo</name>
  <name>Ana</name>
</friends>
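
The parentheses around payload.friends place the key-value pairs of every object in the array directly inside the enclosing friends object, which is what yields the repeated keys. The following variation is a sketch that assumes the same input payload and renames each key while mapping. The friend element name is hypothetical.

```dataweave
%dw 2.0
output application/xml
---
friends: {
    // map builds an array of single-key objects; the enclosing parentheses
    // merge those pairs into the friends object as repeated friend keys.
    (payload.friends map ((item) -> { friend: item.name }))
}
```

The output should contain one friend element for each entry in the input array.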

Example: Transform XML Elements to JSON and Replace Characters

This example iterates over an XML file that contains details of employees, such as the Id, Name, and Address, and converts the file into JSON format. The DataWeave script uses the map function to iterate over each employees element and the replace function to replace the characters - and / in the Address value with blank spaces.

Input

The XML input serves as the payload to the DataWeave source. Notice that the Address element contains - and / characters.

<?xml version='1.0' encoding='UTF-8'?>
<root>
  <employees>
    <Id>1</Id>
    <Name>Mule</Name>
    <Address>MuleSoft Avenue - 123</Address>
  </employees>
  <employees>
    <Id>2</Id>
    <Name>Max</Name>
    <Address>MuleSoft Avenue-456/5/e</Address>
  </employees>
</root>

Source

The following DataWeave script iterates over the employees elements and maps each one to an object. The expression payload01.Address replace /([\-\,\/])/ with " " replaces each -, comma, or / character with a blank space.

%dw 2.0
output application/json
---
payload.root.*employees map ((payload01 , indexOfPayload01) ->
  {
     Id: payload01.Id as String,
     Name: payload01.Name as String,
     Address: payload01.Address replace /([\-\,\/])/ with " "
  }
)

Output

The output shows the XML elements transformed into JSON.

[
  {
    "Id": "1",
    "Name": "Mule",
    "Address": "MuleSoft Avenue   123"
  },
  {
    "Id": "2",
    "Name": "Max",
    "Address": "MuleSoft Avenue 456 5 e"
  }
]

Example: Read DTD value

The following DataWeave script transforms an XML input payload that contains a DocType value into JSON. Notice that the script extracts the value of the doctype declaration using the expression payload.^docType.

Input

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd">
<cXML xml:lang="en-US" payloadID="1615232861.000506@prd327utl1.int.coupahost.com" timestamp="2021-03-08T11:47:42-08:00">
    <Header>
        <From>
            <Credential domain="NetworkID">
                <Identity>TestIdentity</Identity>
            </Credential>
        </From>
        <To>
            <Credential domain="TEST">
                <Identity>Identity</Identity>
            </Credential>
        </To>
    </Header>
</cXML>

Source

%dw 2.0
output application/json
---
{
    header: {
        senderId: payload.cXML.Header.From.Credential.Identity,
        receiverId: payload.cXML.Header.To.Credential.Identity,
        docType: payload.^docType
    }
}

Output

{
  "header": {
    "senderId": "TestIdentity",
    "receiverId": "Identity",
    "docType": {
      "rootName": "cXML",
      "systemId": "http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd"
    }
  }
}

Example: Write a DTD Value

This example shows how to use the docType metadata to create a DocType value in the XML output.

Source

%dw 2.0
output application/xml
ns xml http://www.w3.org/XML/1998/namespace

var cXmlObj = {
  cXML @(payloadID: "9949494", xml#lang: "en-US", timestamp: "2002-02-04T18:39:09-08:00"): {
      Header: {
        From: {
          Credential @(domain: "NetworkId"): {
            Identity: "Identity"
          }
        }
    }
  }
} as Object {
    docType: { rootName: "cXML", systemId: "http://xml.cXML.org/schemas/cXML/1.2.014/cXML.dtd" }
}
---
cXmlObj

Output

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE cXML SYSTEM "http://xml.cXML.org/schemas/cXML/1.2.014/cXML.dtd">
<cXML payloadID="9949494" xmlns:xml="http://www.w3.org/XML/1998/namespace" xml:lang="en-US" timestamp="2002-02-04T18:39:09-08:00">
  <Header>
    <From>
      <Credential domain="NetworkId">
        <Identity>Identity</Identity>
      </Credential>
    </From>
  </Header>
</cXML>

Example: Transform DTD Value to a String Representation

This example uses the Dtd module (dw::xml::Dtd) to transform a DocType value that includes a systemId to a string representation.

Input

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE cXML SYSTEM "http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd">
<cXML xml:lang="en-US" payloadID="1615232861.000506@prd327utl1.int.coupahost.com" timestamp="2021-03-08T11:47:42-08:00">
    <Header>
        <From>
            <Credential domain="NetworkID">
                <Identity>Identity</Identity>
            </Credential>
        </From>
        <To>
            <Credential domain="TEST">
                <Identity>Identity</Identity>
            </Credential>
        </To>
    </Header>
</cXML>

Source

%dw 2.0
output application/json
import * from dw::xml::Dtd
---
docTypeAsString(payload.^docType)

Output

"cXML SYSTEM http://xml.cxml.org/schemas/cXML/1.2.014/cXML.dtd"

Configuration Properties

DataWeave supports the following configuration properties for this format.

Reader Properties

This format accepts properties that provide instructions for reading input data.

Parameter Type Default Description

collectionPath

String

null

Sets the path to the location in the document where the collection is located. Accepts a path expression that identifies the location of the elements to stream.

externalEntities

Boolean

false

Indicates whether to process external entities. Disabled by default to avoid XML External Entity (XXE) attacks.

Valid values are true or false.

indexedReader

Boolean

true

Uses the indexed reader by default when reaching the threshold. Supports US-ASCII, UTF-8 and ISO-8859-1 encodings only. For other encodings, DataWeave uses the in-memory reader.

Valid values are true or false.

maxAttributeSize

Number

-1

Sets the maximum number of characters accepted in an XML attribute. Available since Mule 4.2.1.

maxEntityCount

Number

1

Sets the maximum number of entity expansions. The limit helps avoid Billion Laughs attacks.

nullValueOn

String

'blank'

Indicates whether to read an element with empty or blank text as a null value.

Valid values are empty, none, or blank.

optimizeFor

String

'speed'

Configures the type of optimization for the XML parser to use.

Valid values are speed or memory.

streaming

Boolean

false

Streams input when set to true. Use only if entries are accessed sequentially. The input must be a top-level array. See the streaming example, and see DataWeave Readers.

Valid values are true or false.

supportDtd

Boolean

false

Enables or disables DTD support. Disabling skips (and does not process) internal and external subsets. You can also enable this property by setting the Mule system property com.mulesoft.dw.xml.supportDTD. Note that the default for this property changed from true to false in Mule version 4.3.0-20210601, which includes the June 2021 patch of DataWeave version 2.3.0.

Valid values are true or false.
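
Reader properties can be passed as the third argument of the read or readUrl function, as the examples above do. The following sketch compares the three nullValueOn settings on one hypothetical inline document that has an empty author element and a blank title element. The xml variable and the blank, empty, and none key names are illustrative.

```dataweave
%dw 2.0
output application/json
// Hypothetical input: author is empty, title contains only a space.
var xml = "<book><author></author><title> </title></book>"
---
{
  // Each call parses the same text with a different reader setting.
  blank: read(xml, "application/xml", { nullValueOn: "blank" }),
  empty: read(xml, "application/xml", { nullValueOn: "empty" }),
  none: read(xml, "application/xml", { nullValueOn: "none" })
}
```

Based on the property description, blank should turn both elements into null, empty should turn only author into null, and none should leave both as strings.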

Writer Properties

This format accepts properties that provide instructions for writing output data.

Parameter Type Default Description

bufferSize

Number

8192

Size of the buffer writer. The value must be greater than 8.

defaultNamespace

String

null

Specifies the default namespace of the output XML.

deferred

Boolean

false

Generates the output as a data stream when set to true, and defers the script’s execution until the generated content is consumed.

Valid values are true or false.

doubleQuoteInDeclaration

Boolean

false

Escapes double quotes in the XML declaration when set to true.

Valid values are true or false.

encoding

String

null

The encoding to use for the output, such as UTF-8.

escapeCR

Boolean

false

Escapes CR characters when set to true.

Valid values are true or false.

escapeGT

Boolean

false

Escapes '>' characters when set to true.

Valid values are true or false.

indent

Boolean

true

Writes indented output for better readability by default, or compresses output into a single line when set to false.

Valid values are true or false.

inlineCloseOn

String

'empty'

Controls how an element with a null value is written: as an inline close tag (empty) or as explicit open and close tags (none).

Valid values are empty or none.

onInvalidChar

String

null

Valid values are base64, ignore, or none.

skipNullOn

String

null

Skips null values in the specified data structure. By default, DataWeave does not skip the values.

  • elements: Ignore and omit null elements inside XML output, for example, with output application/xml skipNullOn="elements".

  • attributes: Ignore and omit null attributes inside XML output, for example, with output application/xml skipNullOn="attributes".

  • everywhere: Apply skipNullOn to elements and attributes, for example, with output application/xml skipNullOn="everywhere".

Valid values are elements, attributes, or everywhere.

writeDeclaration

Boolean

true

Writes the XML header declaration when set to true.

Valid values are true or false.

writeDeclaredNamespaces

String

null

Marks the namespaces to declare in the root element of the XML:

  • All: Write all declared namespaces in the root element.

  • ids:<comma separated namespace id>: Write only the specified namespaces.

  • regex:<regex>: Write only the matching namespaces.

writeNilOnNull

Boolean

false

Writes the nil attribute for a null value when this property is set to true.

Valid values are true or false.
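
Writer properties are set on the output directive, separated by commas. The following is a minimal sketch that combines three of the properties listed above. The order, id, and note names are hypothetical.

```dataweave
%dw 2.0
// indent=false writes a single line, writeDeclaration=false omits the XML header,
// and skipNullOn="elements" drops elements whose value is null.
output application/xml indent=false, writeDeclaration=false, skipNullOn="elements"
---
{
  order: {
    id: "A-1",
    note: null
  }
}
```

Assuming the properties behave as described in the table, the output should contain the order and id elements on one line, without the XML declaration and without a note element.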

Supported MIME Types

This format supports the following MIME types.

MIME Type

*/xml

*/*+xml