name,lastname,age,gender
Mariano,de Achaval,37,male
Paula,de Estrada,37,female
csv
CSV Format
MIME type: application/csv
ID: csv
The CSV data format is represented as a DataWeave array of objects in which each object represents a row. All simple values are represented as strings.
The DataWeave reader for CSV input supports the following parsing strategies:
- Indexed
- In-Memory
- Streaming
By default, the CSV reader stores input data from an entire file in memory if the file is 1.5 MB or less. If the file is larger than 1.5 MB, the process writes the data to disk. For very large files, you can improve the performance of the reader by setting the `streaming` property to `true`.
For additional details, see DataWeave Readers.
Examples
The following examples show uses of the CSV format.
Example: Represent CSV Data
The following example shows how DataWeave represents CSV data.
Input
The following sample data serves as input for the DataWeave source.
Source
The DataWeave script transforms the CSV input payload to the DataWeave (dw) format and MIME type.
%dw 2.0
output application/dw
---
payload
dataweave
Output
The DataWeave script produces the following output.
[
  {
    name: "Mariano",
    lastname: "de Achaval",
    age: "37",
    gender: "male"
  },
  {
    name: "Paula",
    lastname: "de Estrada",
    age: "37",
    gender: "female"
  }
]
dataweave
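Because the reader represents every simple value as a string, numeric fields such as `age` need an explicit coercion before they can be used as numbers. The following is a minimal sketch, assuming the same CSV payload as in this example. The choice of JSON output and of the two selected fields is illustrative only.

```dataweave
%dw 2.0
output application/json
---
// Each row is an object whose values are all strings.
// "as Number" coerces the string "37" to the number 37.
payload map ((row) -> {
    name: row.name,
    age: row.age as Number
})
```

With this script, `age` is written as the JSON number 37 rather than the string "37".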
Example: Stream CSV Data
By default, the CSV reader stores input data from an entire file in memory if the file is 1.5 MB or less. If the file is larger than 1.5 MB, the process writes the data to disk. For very large files, you can improve the performance of the reader by setting the `streaming` property to `true`. To demonstrate the use of this property, the next example streams a CSV file and transforms it to JSON.
Input
The structure of the CSV input looks something like the following. Note that a streamed file is typically much longer.
street,city,zip,state,beds,baths,sale_date
3526 HIGH ST,SACRAMENTO,95838,CA,2,1,Wed May 21 00:00:00 EDT 2018
51 OMAHA CT,SACRAMENTO,95823,CA,3,1,Wed May 21 00:00:00 EDT 2018
2796 BRANCH ST,SACRAMENTO,95815,CA,2,1,Wed May 21 00:00:00 EDT 2018
2805 JANETTE WAY,SACRAMENTO,95815,CA,2,1,Wed May 21 00:00:00 EDT 2018
6001 MCMAHON DR,SACRAMENTO,95824,CA,2,1,Wed May 21 00:00:00 EDT 2018
5828 PEPPERMILL CT,SACRAMENTO,95841,CA,3,1,Wed May 21 00:00:00 EDT 2018
csv
XML Configuration
To demonstrate a use of the `streaming` property, the following Mule flow streams a CSV file and transforms it to JSON.
<flow name="dw-streamingFlow" >
  <scheduler doc:name="Scheduler" >
    <scheduling-strategy >
      <fixed-frequency frequency="1" timeUnit="MINUTES"/>
    </scheduling-strategy>
  </scheduler>
  <file:read
    path="${app.home}/input.csv"
    config-ref="File_Config"
    outputMimeType="application/csv; streaming=true; header=true"/>
  <ee:transform doc:name="Transform Message" >
    <ee:message >
      <ee:set-payload ><![CDATA[%dw 2.0
output application/json
---
payload map ((row) -> {
  zipcode: row.zip
})]]></ee:set-payload>
    </ee:message>
  </ee:transform>
  <file:write doc:name="Write"
    config-ref="File_Config1"
    path="/path/to/output/file/output.json"/>
  <logger level="INFO" doc:name="Logger" message="#[payload]"/>
</flow>
xml
- The example configures the Read operation (`<file:read/>`) to stream the CSV input by setting `outputMimeType="application/csv; streaming=true"`. The input CSV file is located in the project directory, `src/main/resources`, which is the location of `${app.home}`.
- The DataWeave script in the Transform Message component uses the `map` function to iterate over each row in the CSV payload and select the value of each field in the `zip` column.
- The Write operation returns a file, `output.json`, which contains the result of the transformation.
- The Logger prints the same output payload that you see in `output.json`.
Output
The CSV streaming example produces the following output.
[
  {
    "zipcode": "95838"
  },
  {
    "zipcode": "95823"
  },
  {
    "zipcode": "95815"
  },
  {
    "zipcode": "95815"
  },
  {
    "zipcode": "95824"
  },
  {
    "zipcode": "95841"
  }
]
json
Configuration Properties
DataWeave supports the following configuration properties for this format.
Reader Properties
This format accepts properties that provide instructions for reading input data.
Parameter | Type | Default | Description |
---|---|---|---|
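`bodyStartLineNumber` | `Number` | `0` | Line number on which the body starts. |
`escape` | `String` | `\` | Character to use for escaping special characters, such as separators or quotes. |
`header` | `Boolean` | `true` | Indicates whether a CSV header is present. Valid values are `true` or `false`. |
`headerLineNumber` | `Number` | `0` | Line number on which the CSV header is located. |
`ignoreEmptyLine` | `Boolean` | `true` | Indicates whether to ignore an empty line. Valid values are `true` or `false`. |
`quote` | `String` | `"` | Character to use for quotes. |
`separator` | `String` | `,` | Character that separates one field from another field. |
`streaming` | `Boolean` | `false` | Streams input when set to `true`. Valid values are `true` or `false`. |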
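Reader properties can also be passed to the DataWeave `read` function as its third argument. The following is a minimal sketch, assuming the reader property for the field separator is named `separator`. The inline semicolon-delimited string is made-up sample data, not part of the original examples.

```dataweave
%dw 2.0
output application/json
---
// Hypothetical inline input: fields are separated by semicolons instead of commas.
// The third argument of read() carries the reader properties.
read("name;age\nMariano;37\nPaula;37", "application/csv", { separator: ";" })
```

The result should be an array of two objects, each with `name` and `age` fields whose values are strings.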
Writer Properties
This format accepts properties that provide instructions for writing output data.
Parameter | Type | Default | Description |
---|---|---|---|
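`bodyStartLineNumber` | `Number` | `0` | Line number on which the body starts. |
`bufferSize` | `Number` | `8192` | Size of the buffer writer, in bytes. The value must be greater than `8`. |
`deferred` | `Boolean` | `false` | Generates the output as a data stream when set to `true`. Valid values are `true` or `false`. |
`encoding` | `String` | `null` | The encoding to use for the output, such as UTF-8. |
`escape` | `String` | `\` | Character to use for escaping special characters, such as separators or quotes. |
`header` | `Boolean` | `true` | Indicates whether a CSV header is present. Valid values are `true` or `false`. |
`headerLineNumber` | `Number` | `0` | Line number on which the CSV header is located. |
`ignoreEmptyLine` | `Boolean` | `true` | Indicates whether to ignore an empty line. Valid values are `true` or `false`. |
`lineSeparator` | `String` | New Line | Line separator to use when writing CSV, for example, `"\r\n"`. |
`quote` | `String` | `"` | Character to use for quotes. |
`quoteHeader` | `Boolean` | `false` | Quotes header values when set to `true`. Valid values are `true` or `false`. |
`quoteValues` | `Boolean` | `false` | Quotes every value when set to `true`. Valid values are `true` or `false`. |
`separator` | `String` | `,` | Character that separates one field from another field. |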
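Writer properties are typically set on the `output` directive of a script. The following is a minimal sketch, assuming the writer property for the field separator is named `separator`. The inline array reuses the names from the first example purely for illustration.

```dataweave
%dw 2.0
output application/csv separator=";"
---
// The writer property on the output directive replaces the default
// comma between fields with a semicolon.
[
  { name: "Mariano", lastname: "de Achaval" },
  { name: "Paula", lastname: "de Estrada" }
]
```

With this setting, the header line and each row should use semicolons between fields instead of commas.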
Supported MIME Types
This format supports the following MIME types.
MIME Type |
---|
`application/csv` |