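Configuring Vision Operations

Configure the [Image] Read by (Url or Base64) operation.

Configure the [Image] Read by (Url or Base64) Operation

The [Image] Read by (Url or Base64) operation reads and interprets an image based on a prompt.

Apply the [Image] Read by (Url or Base64) operation in scenarios such as the following: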

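- Image Analysis
  
  Analyze images in business reports, presentations, or customer service scenarios.
- Content Generation
  
  Describe images for blog posts, articles, or social media.
- Visual Insights
  
  Extract insights from images in research or design projects.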

To configure the [Image] Read by (Url or Base64) operation:

Select the operation on the Anypoint Code Builder or Studio canvas.
In the General properties tab for the operation, enter these values:
- Prompt
  
  Enter the prompt for the operation.
- Image
  
  Enter the URL or the Base64 string of the image file to read.
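
When an image is not reachable by URL, the Image field takes a Base64 string instead. As a rough sketch of producing such a string outside of Mule, the Python snippet below encodes raw image bytes with the standard library. The helper name `to_base64_string` and the sample bytes are hypothetical. The sketch also assumes the field accepts a plain Base64 string without a `data:` URI prefix, which the steps above do not confirm.

```python
import base64


def to_base64_string(image_bytes: bytes) -> str:
    """Encode raw image bytes as an ASCII Base64 string."""
    return base64.b64encode(image_bytes).decode("ascii")


# Hypothetical stand-in for the contents of a real image file:
# the 8-byte signature that every PNG file starts with.
sample_bytes = b"\x89PNG\r\n\x1a\n"

# Assumption: the Image field takes this string as-is, with no data: prefix.
encoded = to_base64_string(sample_bytes)
print(encoded)
```

In practice the bytes would come from reading the image file in binary mode rather than from a literal.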

This is the XML for this operation:

<ms-inference:read-image
  doc:id="dfbd1a61-6e98-4b5b-b77a-bfe031e70d45"
  config-ref="OpenAIConfig"
  doc:name="Read image">
    <ms-inference:prompt>
      <![CDATA[Describe what you see in this image in detail]]>
    </ms-inference:prompt>
    <ms-inference:image-url>
      <![CDATA[https://example.com/image.png]]>
    </ms-inference:image-url>
</ms-inference:read-image>

Output Configuration

This operation responds with a JSON payload that contains the main LLM response. This is an example response:

{
    "payload": {
        "response": "The image depicts the Eiffel Tower in Paris during a snowy day. The tower is partially covered in snow, and the surrounding trees and ground are also blanketed in snow. There is a pathway leading towards the Eiffel Tower, with a lamppost and some fencing along the sides. The overall scene has a serene and picturesque winter atmosphere."
    }
}
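
A downstream consumer typically needs only the text in the `response` field. The following Python sketch assumes the response has been serialized to a JSON string shaped like the example above, including the outer `payload` key. The function name `extract_response_text` is hypothetical, and the sample text is a shortened version of the example.

```python
import json


def extract_response_text(raw_json: str) -> str:
    """Return the LLM text from a JSON document shaped like the example."""
    document = json.loads(raw_json)
    # Assumption: the text sits under payload.response, as in the example.
    return document["payload"]["response"]


# Shortened sample built to match the structure of the example response.
sample = json.dumps(
    {"payload": {"response": "The image depicts the Eiffel Tower in Paris during a snowy day."}}
)
text = extract_response_text(sample)
print(text)
```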

The operation also returns attributes that are not part of the main JSON payload. These attributes include information about token usage, for example:

{
  "attributes": {
      "tokenUsage": {
          "inputCount": 267,
          "outputCount": 68,
          "totalCount": 335
      },
      "additionalAttributes": {
          "finish_reason": "stop",
          "model": "gpt-4o-mini",
          "id": "604ae573-8265-4dc0-b06e-457422f2fbd8"
      }
  }
}

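tokenUsage: Token usage metadata returned as attributes
- inputCount: Number of tokens used to process the input
- outputCount: Number of tokens used to generate the output
- totalCount: Total number of tokens used for the input and the output
additionalAttributes: Additional metadata from the LLM provider
- finish_reason: Reason the LLM response finished
- model: ID of the model used
- id: ID of the request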
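
To illustrate how these fields relate, the Python sketch below reads an attributes document shaped like the example above and checks that `totalCount` equals `inputCount` plus `outputCount` (267 + 68 = 335 in the example). The function name `summarize_attributes` is hypothetical. Treating a `finish_reason` of `stop` as a normal completion is an assumption based on the example value, since other providers may report different values.

```python
import json

# Attributes copied from the example above.
ATTRIBUTES_JSON = """
{
  "attributes": {
    "tokenUsage": {"inputCount": 267, "outputCount": 68, "totalCount": 335},
    "additionalAttributes": {
      "finish_reason": "stop",
      "model": "gpt-4o-mini",
      "id": "604ae573-8265-4dc0-b06e-457422f2fbd8"
    }
  }
}
"""


def summarize_attributes(raw_json: str) -> dict:
    """Summarize token usage and completion metadata from the attributes."""
    attributes = json.loads(raw_json)["attributes"]
    usage = attributes["tokenUsage"]
    extra = attributes["additionalAttributes"]
    return {
        # True when the reported total matches input + output.
        "total_is_consistent": usage["inputCount"] + usage["outputCount"] == usage["totalCount"],
        # Assumption: "stop" marks a response that finished normally.
        "finished_normally": extra["finish_reason"] == "stop",
        "model": extra["model"],
    }


summary = summarize_attributes(ATTRIBUTES_JSON)
print(summary)
```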
