Load balancing: Priority (v3.8+)

Configure the plugin to use three OpenAI models and create priority groups based on their respective weights.

In this example, the GPT-4 model and the GPT-4o-mini model form a group, and the GPT-3 model forms another group.

The first group has a weight of 70 and the second a weight of 25, so the plugin first tries to route requests to GPT-4 or GPT-4o-mini. If both fail, it falls back to the GPT-3 model.
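The fallback order described above can be sketched in Python. This illustrates the idea only and is not the plugin's implementation: the `TARGETS` list, the lowercase model labels, and the `priority_groups` and `route` helpers are all hypothetical. It assumes that targets sharing a weight form one group. For simplicity it tries the targets inside a group in listed order, whereas the plugin may distribute requests within a group differently.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical target list mirroring the example: (model label, weight).
TARGETS: List[Tuple[str, int]] = [
    ("gpt-4", 70),
    ("gpt-4o-mini", 70),
    ("gpt-3", 25),
]


def priority_groups(targets: List[Tuple[str, int]]) -> List[List[str]]:
    """Group targets that share a weight, highest weight first."""
    by_weight: Dict[int, List[str]] = defaultdict(list)
    for name, weight in targets:
        by_weight[weight].append(name)
    return [by_weight[w] for w in sorted(by_weight, reverse=True)]


def route(
    targets: List[Tuple[str, int]],
    is_available: Callable[[str], bool],
) -> Optional[str]:
    """Return the first available target, trying higher-weight groups first."""
    for group in priority_groups(targets):
        for name in group:  # assumption: listed order within a group
            if is_available(name):
                return name
    return None  # every target in every group failed


# Simulate both models in the weight-70 group failing.
down = {"gpt-4", "gpt-4o-mini"}
print(route(TARGETS, lambda name: name not in down))  # prints "gpt-3"
```

The lower-weight group is consulted only after every target in the higher-weight group has been tried, which matches the behavior described for the 70 and 25 groups.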

Prerequisites

  • An OpenAI account

Environment variables

  • OPENAI_API_KEY: The API key to use to connect to OpenAI.
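Before applying the configuration, it can help to confirm that the variable is actually set. The following is a minimal Python sketch. The `require_env` helper is hypothetical and is not part of any Kong tooling, and how the key is passed on to the plugin is outside its scope.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable or raise if it is unset or blank."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


if __name__ == "__main__":
    try:
        require_env("OPENAI_API_KEY")
        print("OPENAI_API_KEY is set")
    except RuntimeError as err:
        print(err)
```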

Set up the plugin
