Configuring Vision Operations
Configure the [Image] Read by (Url or Base64) operation.
Configure the [Image] Read by (Url or Base64) Operation
The [Image] Read by (Url or Base64) operation reads and interprets an image based on a prompt.
Apply the [Image] Read by (Url or Base64) operation in various scenarios, such as:

- Image Analysis: Analyze images in business reports, presentations, or customer service scenarios.
- Content Generation: Describe images for blog posts, articles, or social media.
- Visual Insights: Extract insights from images in research or design projects.
To configure the [Image] Read by (Url or Base64) operation:

1. Select the operation on the Anypoint Code Builder or Studio canvas.
2. In the General properties tab for the operation, enter these values:
   - Prompt: Enter the prompt for the operation.
   - Image: Enter the URL or Base64 string of the image file to read.
3. In the Additional Request Attributes (optional) field, you can pass additional request attributes in the request payload to generate more relevant and precise LLM outputs, as shown in the example after these steps.
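As an illustration, additional request attributes are supplied as a JSON object. The attribute names shown here (temperature and max_tokens) are provider-specific examples, not fields required by the connector; consult your LLM provider's API reference for the attributes it supports:

{
  "temperature": 0.2,
  "max_tokens": 500
}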
This is the XML for this operation:
<ms-inference:read-image
    doc:id="dfbd1a61-6e98-4b5b-b77a-bfe031e70d45"
    config-ref="OpenAIConfig"
    doc:name="Read image">
    <ms-inference:prompt>
        <![CDATA[Describe what you see in this image in detail]]>
    </ms-inference:prompt>
    <ms-inference:image-url>
        <![CDATA[https://example.com/image.png]]>
    </ms-inference:image-url>
</ms-inference:read-image>
Output Configuration
This operation responds with a JSON payload containing the main LLM response. This is an example response:
{
  "payload": {
    "response": "The image depicts the Eiffel Tower in Paris during a snowy day. The tower is partially covered in snow, and the surrounding trees and ground are also blanketed in snow. There is a pathway leading towards the Eiffel Tower, with a lamppost and some fencing along the sides. The overall scene has a serene and picturesque winter atmosphere."
  }
}
The operation also returns attributes outside the main JSON payload, including information about token usage. For example:
{
  "attributes": {
    "tokenUsage": {
      "inputCount": 267,
      "outputCount": 68,
      "totalCount": 335
    },
    "additionalAttributes": {
      "finish_reason": "stop",
      "model": "gpt-4o-mini",
      "id": "604ae573-8265-4dc0-b06e-457422f2fbd8"
    }
  }
}
- tokenUsage: Token usage metadata returned as attributes.
  - inputCount: Number of tokens used to process the input.
  - outputCount: Number of tokens used to generate the output.
  - totalCount: Total number of tokens used for input and output.
- additionalAttributes: Additional metadata from the LLM provider.
  - finish_reason: The finish reason for the LLM response.
  - model: The ID of the model used.
  - id: The ID of the request.
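Because these values arrive as message attributes rather than payload fields, you can reference them with a DataWeave expression. As a minimal sketch (assuming the attributes structure shown above), a Logger component placed after the operation could record token consumption:

<logger level="INFO" doc:name="Log token usage"
    message="#['Tokens used: ' ++ (attributes.tokenUsage.totalCount as String)]"/>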
For Gemini Inference, additional parameters must be included within the GenerationConfig property of the request payload.
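For example, here is an illustrative sketch of that wrapper; the field names (temperature, maxOutputTokens, topP) follow the Gemini API's generationConfig schema, so verify the exact names and supported fields against the Gemini documentation:

{
  "generationConfig": {
    "temperature": 0.4,
    "maxOutputTokens": 512,
    "topP": 0.95
  }
}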