Contact Free trial Login

Query a Database - Mule 4

When you configure a Database operation using the Database connector, there are several ways to add variable values to the SQL statement you execute in the database.

Use Input Parameters to Protect Database Queries

The select operation is used to retrieve information from the RDBMS. The primary concept of this operation is to supply a SQL query and use DataWeave to supply the parameters:

<flow name="selectParameterizedQuery">
  <db:select config-ref="dbConfig">
    <db:sql>SELECT * FROM PLANET WHERE name = :name</db:sql>
    <db:input-parameters>
      #[{'name' : payload}]
    </db:input-parameters>
  </db:select>
</flow>

As you can see in the above example, input parameters are supplied as key-value pairs, which you can create by embedding a DataWeave script. Those keys are used in conjunction with the semicolon character (:) to reference a parameter value by name. This is the recommended approach for using parameters in your query.

Since version 1.4.0 (escaping semi-colons): If you need to use the semicolon character (:) in your SQL Query, you can escape it by putting a backslash before it. This is useful, for example, when using PostgeSQL type casting, which requires two semicolons before the type you are casting to, for example: <db:sql>SELECT price\:\:float8 FROM PRODUCT</db:sql>

The alternative is to directly write ` <db:sql>SELECT * FROM PLANET WHERE name = #[payload] </db:sql>`, but this is a very dangerous practice that is not recommended.

Advantages of using input parameters to configure the WHERE clause in a SELECT statement this way:

  • The query becomes immune to SQL injection attacks.

  • The connector can perform optimizations that are not possible otherwise, which improves the app’s overall performance.

DataSense is available for the operation’s input and output. By analyzing the query, the connector will automatically calculate the structure of the query’s output by analyzing the projection section of the SQL statement. At the same time, by comparing the conditions in the WHERE clause to the table structure, it will also generate DataSense input to help you build the DataWeave script that will generate the input parameters.

Dynamic Queries

Sometimes, you not only need to parameterize the WHERE clause, but also parameterize parts of the query itself. Examples of use cases for this are queries that need to hit online versus historic tables that depend on a condition, or complex queries where the project columns need to vary.

In this example, you can see how a full expression is used to produce the query by building a string in which the table depends on a variable $(vars.table). An important thing to notice is that although some of the query text is dynamic "SELECT * FROM $(vars.table), the WHERE clause is still using the best practice of defining the WHERE condition using input parameters, in this case WHERE name = :name:

<set-variable variableName="table" value="PLANET"/>
<db:select config-ref="dbConfig">
    <db:sql>#["SELECT * FROM $(vars.table) WHERE name = :name"]</db:sql>
    <db:input-parameters>
        #[{'name' : payload}]
    </db:input-parameters>
</db:select>
In Mule 3, the concept of select was split in parameterized and dynamic queries, and you couldn’t use both at the same time. You had to choose between having a dynamic query or having the advantages of using parameters (SQL Injection protection, PreparedStatement optimization, and so on). Furthermore, the syntax to do one or the other was different, so you had to learn two different ways of doing the same thing. But with the Database Connector in Mule 4, you can now use both methods at the same time by using expressions in the query.

Why do I need dynamic queries at all for the example above? Can I just treat the table like another Input parameter? The answer is no. Input parameters can only be applied to parameters in a WHERE clause. To modify any other part of the query, you need to use DataWeave’s interpolation operator.

Stream Large Results

Database tables tend to be big. One single query might return tens of thousands of records, especially when dealing with integration use cases. Streaming is a great solution for this. What does streaming mean? Suppose you have a query which returns 10K rows, attempting to fetch all those rows at once will result in the following:

  • Performance degradation, since that’s a big pull from the network

  • A risk of running out of memory, since all that information needs to be loaded into RAM.

Streaming means that the connector will not fetch the 10K rows at once; instead, it will fetch a smaller chunk, and once that chunk has been consumed it will go fetch the rest. That way, you can reduce pressure over the network and memory.

Stream in Mule 4 versus Stream in Mule 3

In Mule 3.x, streaming was something you had to specifically enable because it was disabled by default. In Mule 4, streaming is transparent and always enabled. You don’t have to worry about it anymore. You can simply trust that the feature is there.

Another improvement from Mule 3 is that you can now use the new repeatable streams mechanism from Mule 4. This means that streams are now repeatable, and you can have DataWeave and other components process the same stream many times, even in parallel. For more information on repeatable streams, see Streaming in Mule 4.0.

Limit Results

Mule runtime allows the connector to handle streaming gracefully. However, that does not mean that it’s a good idea to move large chunks of data from the database to Mule. Even with streaming, a typical SQL query can return many rows, each one containing a lot of information. This generates:

  • Large network overhead

  • A large resources footprint on the database server

  • Large memory and requirements in the Mule Runtime

The Select operation provides parameters (fetchSize and maxRows) to help with this:

<db:select fetchSize="200" maxRows="1000" config-ref="dbConfig">
  <db:sql>select * from some_table</db:sql>
</db:select>

This example tells the connector not to fetch more that 1000 total rows as the result of an SQL SELECT query, but to try to only stream results from the database to the Database connector 200 at a time. So in this example, it would require 5 separate network round trips to retrieve 1000 matching rows.

The combination limits the total amount of information that will be retrieved (the maxRows value) and guarantees that the data is returned from the database over the network in smaller chunks (the fetchSize value).

The fetchSize is a hint to the JDBC driver and need not always be enforced. The behavior varies between JDBC driver providers. Often, the default fetchSize for a JDBC driver is 10.

Set a Query Timeout

Query Timeout Reference

Sometimes database queries take a long time to execute. The following factors often cause delays in query execution:

  • An inefficient query, such as one having improper indexing that iterates over many rows

  • A busy RDBMS or network

  • A lock contention

Generally, it’s recommended to set a timeout on the query. To manage timeouts, configure queryTimeout and queryTimeoutUnit. The following example shows how to set a timeout for the Select operation, but all operations support setting a timeout:

<db:select queryTimeout="0" queryTimeoutUnit="SECONDS" config-ref="dbConfig">
   <db:sql>select * from some_table</db:sql>
</db:select>

Was this article helpful?

💙 Thanks for your feedback!

Edit on GitHub