Choose your deployment mode:
Here’s a flowchart that breaks down how the decision process works:
flowchart TD
A{Deployment type?} --> B(Traditional mode)
A{Deployment type?} --> C(Hybrid mode)
A{Deployment type?} --> D(DB-less mode)
A{Deployment type?} --> E(Konnect DP)
B ---> F{Enough hardware to
run another cluster?}
C --> G(Upgrade CP first) & H(Upgrade DP second)
D ----> K([Rolling upgrade])
E ----> K
G --> F
F ---Yes--->I([Dual-cluster upgrade])
F ---No--->J([In-place upgrade])
H ---> K
click K "/gateway/upgrade/rolling/"
click I "/gateway/upgrade/dual-cluster/"
click J "/gateway/upgrade/in-place/"
Figure 1: Choose an upgrade strategy based on your deployment type. For Traditional mode, choose a dual-cluster upgrade if you have enough resources, or an in-place upgrade if you don’t have enough resources. For DB-less mode and Konnect DPs, use a rolling upgrade. For Hybrid mode, use one of the Traditional mode strategies for CPs, and the rolling upgrade for DPs.
See the following sections for breakdowns of each strategy.
A Traditional mode deployment is when all Kong Gateway components are running in one environment,
and there is no Control Plane/Data Plane separation.
You have two options when upgrading Kong Gateway in Traditional mode:
- Dual-cluster upgrade: A new Kong Gateway cluster running version Y is deployed alongside the current cluster running version X, so that two clusters serve requests concurrently during the upgrade process.
- In-place upgrade: An in-place upgrade reuses the existing database. You have to shut down cluster X first, then configure the new cluster Y to point to the same database.
We recommend using a dual-cluster upgrade if you have the resources to run another cluster concurrently.
Use the in-place method only if resources are limited, as it will cause business downtime.
Upgrading Kong Gateway from one LTS version to another LTS version with zero downtime can be achieved through a dual-cluster upgrade strategy.
This approach involves setting up a new cluster running the upgraded version of Kong Gateway alongside the existing cluster running the current version.
At a high level, the process typically involves the following steps:
- Provisioning a same-size deployment: Ensure that the new cluster, which will run the upgraded version of Kong Gateway, has the same capacity and resources as the existing cluster. This ensures that both clusters can handle the same amount of traffic and workload.
- Setting up the dual-cluster deployment: Once the new cluster is provisioned, deploy your APIs and configurations to both clusters simultaneously. The dual-cluster deployment lets the old and new clusters coexist and process requests in parallel.
- Data synchronization: During the dual-cluster deployment, data synchronization is crucial to ensure that both clusters hold the same data. This can involve migrating data from the old cluster to the new one, or setting up a shared data storage solution to keep both clusters in sync. Import the database from the old cluster into the new cluster using a snapshot or pg_restore.
- Traffic rerouting: While the new cluster runs alongside the old one, start routing incoming traffic to the new cluster. Do this gradually or through a controlled switchover mechanism to minimize any impact on users. Any load balancer can handle this, such as DNS, Nginx, F5, or even a Kong Gateway node with the Canary plugin enabled.
- Testing and validation: Before performing a complete switchover to the new cluster, thoroughly test and validate the functionality of the upgraded version. This includes testing APIs, plugins, authentication mechanisms, and other functionality to ensure everything works as expected.
- Complete switchover: Once you are confident that the upgraded cluster is fully functional and stable, redirect all incoming traffic to the new cluster. This step completes the upgrade process; you can then decommission the old cluster.
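The data synchronization step can be sketched with pg_dump and pg_restore. The hostnames, user, and database name below are placeholders for your own environment:

```shell
# Dump the Kong database from the current cluster X
# (custom format, -Fc, so pg_restore can consume it).
pg_dump -h old-db.example.com -U kong -d kong -Fc -f kong_x.dump

# Restore the dump into the new cluster Y's database.
pg_restore -h new-db.example.com -U kong -d kong --clean kong_x.dump

# On the new cluster, bring the restored schema up to version Y.
kong migrations up
kong migrations finish
```

Alternatively, take a database snapshot with your platform's own tooling instead of pg_dump.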
By following this dual cluster deployment strategy, you can achieve a smooth and zero-downtime upgrade from one LTS version of Kong Gateway to another. This approach helps ensure high availability and uninterrupted service for your users throughout the upgrade process.
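The gradual traffic-rerouting step can be expressed as a weighted upstream in most load balancers. A minimal Nginx sketch with placeholder hostnames, shifting 10% of traffic to the new cluster:

```nginx
upstream kong_clusters {
    # Most traffic stays on the current cluster X for now.
    server kong-x.example.com:8000 weight=90;
    # Shift a small share to the new cluster Y; raise this weight
    # as testing and validation build confidence.
    server kong-y.example.com:8000 weight=10;
}

server {
    listen 80;
    location / {
        proxy_pass http://kong_clusters;
    }
}
```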
While an in-place upgrade allows you to perform the upgrade on the same infrastructure,
it does require some downtime during the actual upgrade process.
Plan a suitable maintenance window during which you can perform the upgrade.
During this period, Kong Gateway will be temporarily unavailable.
For scenarios where zero downtime is critical, consider the dual-cluster upgrade method instead,
keeping in mind the additional resources and complexity it requires.
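At the command level, an in-place upgrade typically follows this sketch, run on the hosts being upgraded (the package-install step for version Y is omitted):

```shell
# 1. Stop the current cluster X so nothing writes to the database.
kong stop

# 2. With the new Kong Gateway version Y installed on the same hosts,
#    run the pending migrations against the existing database.
kong migrations up

# 3. Finalize the migrations once 'up' completes without errors.
kong migrations finish

# 4. Start the upgraded cluster Y against the same database.
kong start
```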
In DB-less mode, each independent Kong Gateway node loads a copy of declarative Kong Gateway
configuration data into memory without persistent database storage, so failure of some nodes doesn’t spread to other nodes.
Deployments in this mode should use the rolling upgrade strategy.
You can validate the declarative YAML content against version Y using the deck gateway validate or the kong config parse command.
You must back up your current kong.yaml file before starting the upgrade.
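The backup and validation steps above might look like the following; the file path is a placeholder, and you only need one of the two validation commands:

```shell
# Back up the current declarative configuration first.
cp kong.yaml kong.yaml.bak

# Validate the config against the new version with decK...
deck gateway validate kong.yaml

# ...or with the Kong CLI on a node running version Y.
kong config parse kong.yaml
```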
Hybrid mode deployments consist of one or more Control Plane (CP) nodes, and one or more Data Plane (DP) nodes.
CP nodes use a database to store Kong Gateway configuration data, whereas DP nodes don’t, since they get all of the needed information from the CP.
The recommended upgrade process is a combination of different upgrade strategies for each type of node, CP or DP.
The major challenge with a Hybrid mode upgrade is the communication between the CP and DP.
As Hybrid mode requires the minor version of the CP to be no less than that of the DP, you must upgrade CP nodes before DP nodes.
The upgrade must be carried out in two phases:
- Upgrade the CP nodes following the recommendations in the Traditional mode section, while the DP nodes are still serving API requests.
- Upgrade the DP nodes following the recommendations in the DB-less mode section. Point the new DP nodes to the new CP to avoid version conflicts.
Because the CP and DP roles are decoupled, DP nodes can continue serving API requests while the CP is being upgraded.
With this method, there is no business downtime.
Custom plugins (either your own plugins or third-party plugins that are not shipped with Kong Gateway)
need to be installed on both the Control Plane and the Data Planes in Hybrid mode.
Install the plugins on the Control Plane first, and then the Data Planes.
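For example, installing a LuaRocks-packaged custom plugin might follow this sketch on each node; the plugin name is hypothetical:

```shell
# Run on every Control Plane node first, then on every Data Plane node.
luarocks install kong-plugin-my-plugin   # hypothetical rock name

# Enable the plugin in kong.conf on each node:
#   plugins = bundled,my-plugin
# then reload the node to pick it up.
kong reload
```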
See the following sections for a breakdown of the options for Hybrid mode deployments.
CP nodes must be upgraded before DP nodes. CP nodes serve an admin-only role and require database support.
You can select from the same upgrade strategies described for Traditional mode (dual-cluster or in-place),
as shown in figure 2 and figure 3 respectively.
Upgrading the CP nodes using the dual-cluster strategy:
flowchart TD
DBA[(Current
database)]
DBB[(New
database)]
CPX(Current Control Plane X)
Admin(No admin
write operations)
CPY(New Control Plane Y)
DPX(fa:fa-layer-group Current Data Plane X nodes)
API(API requests)
DBA -.- CPX -."DP connects to either \nCP X...".- DPX
Admin -.X.- CPX & CPY
DBB --pg_restore--- CPY -."...OR to CP Y".- DPX
API--> DPX
style API stroke:none!important,fill:none!important
style DBA stroke-dasharray:3
style CPX stroke-dasharray:3
style Admin fill:none!important,stroke:none!important,color:#d44324!important
linkStyle 2,3 stroke:#d44324!important,color:#d44324!important
Figure 2: The diagram shows a CP upgrade using the dual-cluster strategy.
The new CP Y is deployed alongside the current CP X, while current DP nodes X are still serving API requests.
Upgrading the CP nodes using the in-place strategy:
flowchart
DBA[(Database)]
CPX(Current Control Plane X \n #40;inactive#41;)
Admin(No admin \n write operations)
CPY(New Control Plane Y)
DPX(fa:fa-layer-group Current Data Plane X nodes)
API(API requests)
DBA -..- CPX -."DP connects to either \nCP X...".- DPX
Admin -.X.- CPX & CPY
DBA --"kong migrations up \n kong migrations finish"--- CPY -."...OR to CP Y".- DPX
API--> DPX
style API stroke:none!important,fill:none!important
style CPX stroke-dasharray:3
style Admin fill:none!important,stroke:none!important,color:#d44324!important
linkStyle 2,3 stroke:#d44324!important,color:#d44324!important
Figure 3: The diagram shows a CP upgrade using the in-place strategy, where the current CP X is directly replaced by a new CP Y.
The database is reused by the new CP Y, and the current CP X is shut down once all nodes are migrated.
As both diagrams show, DP nodes X can remain connected to the current CP node X, or switch to the new CP node Y.
Kong Gateway guarantees that new minor versions of CPs are compatible with old minor versions of the DP,
so you can temporarily point DP nodes X to the new CP node Y.
This lets you pause the upgrade process if needed, or conduct it over a longer period of time.
This setup is meant to be temporary, to be used only during the upgrade process.
We do not recommend running a combination of new versions of CP nodes and old versions of DP nodes in a long-term production deployment.
After the CP upgrade, cluster X can be decommissioned. You can delay this task to the very end of the DP upgrade.
Once the CP nodes are upgraded, you can move on to upgrade the DP nodes.
The only supported upgrade strategy for DP upgrades is the rolling upgrade.
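During the rolling upgrade, each new DP node Y is pointed at the new CP Y through its kong.conf. A minimal sketch, assuming hypothetical hostname and certificate paths:

```
role = data_plane
database = off

# Point this DP at the new Control Plane Y (placeholder hostname).
cluster_control_plane = cp-y.example.com:8005

# mTLS certificate pair shared with the Control Plane.
cluster_cert = /etc/kong/cluster.crt
cluster_cert_key = /etc/kong/cluster.key
```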
The following diagrams, figures 4 and 5, are the counterparts of figures 2 and 3, respectively.
Using the dual-cluster strategy with a rolling upgrade workflow:
flowchart TD
DBX[(Current \n database)]
DBY[(New \n database)]
CPX(Current Control Plane X)
CPY(New Control Plane Y)
DPX(Current Data Planes X)
DPY(New Data Planes Y)
API(API requests)
LB(Load balancer)
Admin(No admin \n write operations)
Admin2(No admin \n write operations)
subgraph A [ ]
Admin -.X.- CPX
DBX -.- CPX
DBY --- CPY
CPX -."Current DP connects to \neither CP X...".- DPX
Admin2 -.X.- CPY
CPY -."...OR to CP Y".- DPX
DPX -.90%..- LB
CPY --- DPY --10%---- LB
end
subgraph B [ ]
API --> LB & LB & LB
end
linkStyle 0,4 stroke:#d44324!important,color:#d44324!important
linkStyle 8,9 stroke:#b6d7a8!important
style CPX stroke-dasharray:3
style DPX stroke-dasharray:3
style DBX stroke-dasharray:3
style API stroke:none!important,fill:none!important
style A stroke:none!important,display:none!important
style B stroke:none!important,display:none!important
style Admin fill:none!important,stroke:none!important,color:#d44324!important
style Admin2 fill:none!important,stroke:none!important,color:#d44324!important
Figure 4: The diagram shows a DP upgrade using the dual-cluster and rolling strategies.
The new CP Y is deployed alongside the current CP X, while the current DP nodes X are still serving API requests.
In the diagram, the current database and CP X are shown in grey instead of white, indicating that the old CP has already been upgraded and might have been decommissioned.
Using the in-place strategy with a rolling upgrade workflow:
flowchart
DBA[(Database)]
CPX(Current Control Plane X \n #40;inactive#41;)
CPY(New Control Plane Y)
DPX(Current Data Planes X)
DPY(New Data Planes Y)
API(API requests)
LB(Load balancer)
Admin(No admin \n write operations)
Admin2(No admin \n write operations)
subgraph A [ ]
Admin -.X.- CPX
DBA -.X.- CPX
DBA --- CPY
CPX -."Current DP connects to \neither CP X...".- DPX
Admin2 -.X.- CPY
CPY -."OR to CP Y".- DPX -.90%..- LB
CPY --- DPY --10%---- LB
end
subgraph B [ ]
API --> LB & LB & LB
end
linkStyle 0,1,4 stroke:#d44324!important,color:#d44324!important
linkStyle 8,9 stroke:#b6d7a8!important
style CPX stroke-dasharray:3,stroke:#c1c6cdff!important
style DPX stroke-dasharray:3
style A stroke:none!important,color:#fff!important
style B stroke:none!important,color:#fff!important
style Admin fill:none!important,stroke:none!important,color:#d44324!important
style Admin2 fill:none!important,stroke:none!important,color:#d44324!important
Figure 5: The diagram shows a DP upgrade using the in-place and rolling strategies.
The diagram shows that the database is reused by the new CP Y, while current DP nodes X are still serving API requests.