Look Up Data in an Excel (XLSX) File

DataWeave 2.2 is compatible and bundled with Mule 4.2. Standard Support for Mule 4.2 ended on May 2, 2021, and this version of Mule will reach its End of Life on May 2, 2023, when Extended Support ends.

Deployments of new applications to CloudHub that use this version of Mule are no longer allowed. Only in-place updates to applications are permitted.

MuleSoft recommends that you upgrade to the latest version of Mule 4 that is in Standard Support so that your applications run with the latest fixes and security enhancements.

This DataWeave example uses the filter function to return only the rows in an Excel (XLSX) input file that contain a specified value. Before you begin, note that DataWeave version 2 (%dw 2.0) is for Mule 4 apps. For a Mule 3 app, refer to DataWeave 1.0 (%dw 1.0) examples, within the Mule 3.9 documentation set. For other Mule versions, you can use the Mule Runtime version selector in the table of contents.

By default, files over 1.5 MB are stored on disk. Smaller files are stored in memory.

The following DataWeave script reads an XLSX file and returns filtered data from it. It assumes that a spreadsheet named ourBugs.xlsx contains data on bugs assigned to all the members of a team, including an assignee named Fred M.

DataWeave Script:
%dw 2.0
var myInput = readUrl("classpath://ourBugs.xlsx", "application/xlsx")
output application/json
---
myInput."Data" filter ((entry, index) -> entry."Assignee" == "Fred M")
  • The script passes classpath:ourBugs.xlsx to the readUrl function to read the file from a Studio project directory (src/main/resources). It stores the results in the variable myInput.

  • The script selects a sheet named Data from the XLSX file, then filters out all records except the ones where the Assignee column contains the value Fred M. It returns the results in an array of JSON objects, for example:

    Sample Output:
    [
      {
        "Issue Key": "BUG-11708",
        "Issue Type": "Bug",
        "Summary": "Some Description of the Bug",
        "Assignee": "Fred M",
        "Reporter": "Natalie C",
        "Priority": "To be reviewed",
        "Status": "Closed",
        "Resolution": "Done",
        "Created": "2019-04-29T03:57:00",
        "Updated": "2019-05-06T10:40:00",
        "Due Date": ""
      },
      {
        "Issue Key": "BUG-4903",
        "Issue Type": "Story",
        "Summary": "Some Description of the Bug",
        "Assignee": "Fred M",
        "Reporter": "Fred M",
        "Priority": "To be reviewed",
        "Status": "In Progress",
        "Resolution": "",
        "Created": "2019-05-07T11:22:00",
        "Updated": "2019-05-08T10:16:00",
        "Due Date": ""
      },
      {
        "Issue Key": "BUG-4840",
        "Issue Type": "Story",
        "Summary": "Some Description of the Bug",
        "Assignee": "Fred M",
        "Reporter": "Pablo C",
        "Priority": "To be reviewed",
        "Status": "In Validation",
        "Resolution": "",
        "Created": "2019-04-30T07:11:00",
        "Updated": "2019-05-08T10:16:00",
        "Due Date": ""
      }
    ]

Using the same DataWeave script as in the previous example, the next example writes the results of the filter expression to a file, fredBugs.json. The example is a configuration XML from a Mule project in Studio.

XML Configuration in Studio:
<file:config name="File_Read_Config" doc:name="File Read Config" />
<file:config name="File_Write_Config" doc:name="File Write Config" />
<flow name="xlsx-lookup" >
  <scheduler doc:name="Scheduler" >
    <scheduling-strategy >
      <fixed-frequency frequency="1" timeUnit="MINUTES"/>
    </scheduling-strategy>
  </scheduler>
  <ee:transform doc:name="Transform Message" >
    <ee:message >
      <ee:set-payload ><![CDATA[%dw 2.0
var myInput = readUrl("classpath://ourBugs.xlsx", "application/xlsx")
output application/json
---
myInput."Data" filter ((entry, index) -> entry."Assignee" == "Fred M")]]></ee:set-payload>
    </ee:message>
  </ee:transform>
  <file:write doc:name="Write JSON"
              path="/path/to/fredBugs.json"
              config-ref="File_Write_Config">
</flow>
  • The Scheduler (scheduler) triggers the flow to execute the next component, Transform Message.

  • The Transform Message component (ee:transform) provides a DataWeave script to return all records from the "Data" sheet for which the Assignee column contains the value Fred M, and it transforms the binary XLSX input to JSON output.

  • The Write operation (file:write) from the File connector receives the JSON payload from ee:transform and writes it to a file called fredBugs.json.

Was this article helpful?

💙 Thanks for your feedback!

Edit on GitHub