Creating an LLM Proxy

You can configure the LLM Proxy to use different models and different routes.

A large Flex Gateway supports up to 50 LLM Proxies.

Before You Begin

  1. Deploy a Flex Gateway version 1.11.4 or later where you want to deploy your LLM Proxy.

  2. Ensure you have the API Manager API Creator permission.

  3. Retrieve your API keys from your LLM Providers.

  4. Configure a semantic service if you want to use semantic routing.
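
The version prerequisite in step 1 is easy to get wrong when versions are compared as plain strings, because "1.9.0" sorts after "1.11.4" alphabetically. The following is a minimal sketch of a numeric check, assuming you already have the gateway version as a dotted string. The function names are hypothetical and are not part of any Flex Gateway tooling.

```python
from __future__ import annotations


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version string such as '1.11.4' into (1, 11, 4)."""
    return tuple(int(part) for part in version.strip().split("."))


# Minimum Flex Gateway version named in the prerequisites above.
MIN_FLEX_GATEWAY_VERSION = parse_version("1.11.4")


def supports_llm_proxy(gateway_version: str) -> bool:
    """Return True when the gateway version is 1.11.4 or later (hypothetical helper)."""
    return parse_version(gateway_version) >= MIN_FLEX_GATEWAY_VERSION
```

For example, supports_llm_proxy("1.12.0") is true, while supports_llm_proxy("1.9.0") is false.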

Create an LLM Proxy

  1. From API Manager, click LLM Proxies.

  2. Click + Add LLM Proxy.

  3. Configure the Inbound Endpoint of the LLM Proxy:

    1. Define an LLM Proxy Name.

    2. Select an endpoint Format:

      • OpenAI: Select the OpenAI API format to send requests to all supported LLM Providers (including Gemini).

      • Gemini: Select the Gemini API format to send requests to only Gemini.

    3. Define a Base path.

    4. Select Advanced options if necessary.

    5. Click Next.

  4. From Select a gateway, select the Flex Gateway to deploy the server instance to.

  5. Configure the routes that comprise the Outbound Endpoint:

    1. Select your LLM Provider.

    2. Ensure the URL for your provider is correct. Edit if necessary.

    3. Configure access details for the provider endpoint.

    4. Select a Static or Dynamic API Key. If selecting Dynamic API Key, define a DataWeave script to extract the API Key from the incoming request.

    5. Select a Target Model to override the model version specified in the payload. Selecting Not Applicable sends the request to the specified model. A Target Model is required for semantic routing.

      To configure a target model for Amazon Bedrock Claude models, you must enter the provider and model ID formatted as [provider_prefix]/[internal_model_id].

      To learn how to find the model ID, see Amazon Bedrock Model Names.

    6. Click Add LLM Route to add more routes. Repeat the previous steps to configure each new route. Each LLM Provider supports one route.

  6. If adding multiple routes, select a Routing strategy. To configure your routing strategy, see Configure Model-Based Routing or Configure Semantic Routing.

  7. Click Save & Deploy.
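
The route rules in step 5 can be checked before you enter them in API Manager. The following is a rough sketch only: the LLMRoute type, the provider labels such as "bedrock", and the example URLs are hypothetical and do not reflect how API Manager stores routes. It encodes three rules stated above: each LLM Provider supports one route, semantic routing requires a Target Model, and an Amazon Bedrock target model is written as [provider_prefix]/[internal_model_id].

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LLMRoute:
    """Hypothetical stand-in for one outbound route."""
    provider: str                     # illustrative label, e.g. "openai" or "bedrock"
    url: str
    target_model: str | None = None   # None stands for "Not Applicable"


def is_bedrock_target_model(value: str) -> bool:
    """Check the [provider_prefix]/[internal_model_id] shape: two non-empty parts."""
    prefix, sep, model_id = value.partition("/")
    return bool(prefix) and sep == "/" and bool(model_id)


def validate_routes(routes: list[LLMRoute], semantic_routing: bool = False) -> list[str]:
    """Return a list of problems; an empty list means the routes pass these checks."""
    problems: list[str] = []
    seen: set[str] = set()
    for route in routes:
        # Each LLM Provider supports one route.
        if route.provider in seen:
            problems.append(f"{route.provider}: each LLM Provider supports one route")
        seen.add(route.provider)
        # A Target Model is required for semantic routing.
        if semantic_routing and route.target_model is None:
            problems.append(f"{route.provider}: semantic routing requires a Target Model")
        # Bedrock target models use the prefix/model-id format.
        if (route.provider == "bedrock" and route.target_model is not None
                and not is_bedrock_target_model(route.target_model)):
            problems.append(f"{route.provider}: expected [provider_prefix]/[internal_model_id]")
    return problems
```

Calling validate_routes with two routes for the same provider, or with a Bedrock target model that has no slash, returns one message per violated rule.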

Configure Model-Based Routing

  1. Configure multiple routes. Click Add LLM Route to create new routes.

  2. Select Model-based for Routing strategy.

  3. Optionally, enable a Fallback route to receive requests when the provider or model is incorrectly specified. If enabling a fallback route:

    1. Select a Route to fallback to.

    2. Select a target model for the fallback route to use.

  4. If no fallback route is configured and a route fails, an error response is returned.

  5. Return to Create an LLM Proxy step 7 to finish configuring your LLM Proxy.
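
The fallback behavior described in steps 3 and 4 can be pictured with a small sketch. This is not the gateway's implementation. It assumes that each route knows which model names it accepts and that the request names a model. The Route type, the NoRouteError exception, and the route and model names are hypothetical. It covers only the incorrectly specified model case, not provider failures.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    models: frozenset[str]   # assumption: the model names this route accepts


class NoRouteError(Exception):
    """Stands in for the error response returned when no route can serve the request."""


def select_route(
    requested_model: str,
    routes: list[Route],
    fallback: tuple[str, str] | None = None,
) -> tuple[str, str]:
    """Return (route name, model) for a request, using the fallback when nothing matches.

    fallback is (route name, target model), mirroring the two fallback fields above.
    """
    for route in routes:
        if requested_model in route.models:
            return route.name, requested_model
    if fallback is not None:
        # The model was incorrectly specified, so use the fallback route and its model.
        return fallback
    raise NoRouteError(f"no route for model {requested_model!r} and no fallback configured")
```

With a fallback of ("openai-route", "model-a"), a request for an unknown model resolves to that pair. Without a fallback, the same request raises NoRouteError.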

Configure Semantic Routing

To configure semantic routing:

  1. Make sure you have already configured a semantic service.

  2. Configure multiple routes and select a target model for each route. Click Add LLM Route to create new routes.

  3. Select Semantic for Routing strategy.

  4. Click Select a service and select a service.

  5. Define or select a prompt topic for the routes:

    • Advanced scale semantic service:

      1. Select prompt topics from your predefined prompt topics.

    • Basic scale semantic service:

      1. Click Select prompt topics.

      2. Click + Create prompt topic.

      3. Define a Prompt topic name.

      4. Define Prompt utterances or click Upload utterances to upload a plain text file containing your prompt utterances.

      5. Click Create.

      6. Create multiple prompt topics for each route as needed.

  6. Configure a Fallback route to receive requests that don’t match a semantic route:

    1. Specify an accuracy threshold. When the accuracy of the semantic match is less than this threshold, traffic is sent to the fallback route.

    2. Select a Route to fallback to.

    3. Select a Target model for the fallback route to use.

  7. Create a Semantic prompt guard to block users from asking the server about specific topics:

    • Advanced scale semantic service:

      1. Select topics from your predefined prompt topics.

    • Basic scale semantic service:

      1. Click + Create deny list.

      2. Define a Prompt topic name.

      3. Define prompt utterances or click Upload utterances to upload a plain text file containing your prompt utterances.

      4. Click Create.

      5. Create multiple prompt topics for each route as needed.

    Creating a semantic prompt guard automatically applies the Semantic Prompt Guard policy.

  8. Return to Create an LLM Proxy step 7 to finish configuring your LLM Proxy.
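
The interaction between prompt topics, the accuracy threshold, the fallback route, and the prompt guard can be sketched as follows. This is an illustration under stated assumptions, not the semantic service's algorithm. A toy word-overlap score stands in for whatever similarity the service computes, the deny list is checked first and reuses the same threshold, and all function, topic, and route names are hypothetical.

```python
from __future__ import annotations


def overlap_score(prompt: str, utterance: str) -> float:
    """Toy similarity: Jaccard overlap of lowercase words, between 0.0 and 1.0."""
    a, b = set(prompt.lower().split()), set(utterance.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def best_topic(prompt: str, topics: dict[str, list[str]]) -> tuple[str | None, float]:
    """Return the topic whose utterances best match the prompt, with its score."""
    best_name, best_score = None, 0.0
    for name, utterances in topics.items():
        for utterance in utterances:
            score = overlap_score(prompt, utterance)
            if score > best_score:
                best_name, best_score = name, score
    return best_name, best_score


def route_prompt(
    prompt: str,
    topics: dict[str, list[str]],        # prompt topic -> utterances
    topic_routes: dict[str, str],        # prompt topic -> route name
    deny_topics: dict[str, list[str]],   # prompt guard topics -> utterances
    threshold: float,
    fallback_route: str,
) -> str:
    """Return a route name, or 'blocked' when the prompt guard matches."""
    # Assumption: the prompt guard runs first and uses the same threshold.
    deny_topic, deny_score = best_topic(prompt, deny_topics)
    if deny_topic is not None and deny_score >= threshold:
        return "blocked"
    topic, score = best_topic(prompt, topics)
    # A match below the accuracy threshold goes to the fallback route.
    if topic is None or score < threshold:
        return fallback_route
    return topic_routes[topic]
```

Raising the threshold sends more traffic to the fallback route. Lowering it makes topic matches, and in this sketch prompt guard matches, more permissive.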

Edit and Delete an LLM Proxy

To edit an LLM Proxy:

  1. From API Manager, click LLM Proxies.

  2. Click the name of the LLM Proxy you want to edit.

  3. Click Configuration.

  4. Switch between the Inbound, Gateway, and Outbound configurations to make the necessary edits.

  5. Click Save & Deploy.

To delete an LLM Proxy:

  1. From API Manager, click LLM Proxies.

  2. Click the three-dots menu of the LLM Proxy you want to delete.

  3. Click Delete LLM Proxy.

  4. Click Yes, Delete.