Made by: Kong Inc.
Supported Gateway Topologies: hybrid, db-less, traditional
Supported Konnect Deployments: hybrid, cloud-gateways, serverless
Compatible Protocols: grpc, grpcs, http, https, tcp, tls, tls_passthrough, udp, ws, wss

When enabled, the Zipkin plugin traces requests in a way that’s compatible with Zipkin.

The code is structured around an OpenTracing core using the opentracing-lua library to collect timing data of a request in each of Kong Gateway’s phases. The plugin uses an opentracing-lua compatible extractor, injector, and reporters to implement Zipkin’s protocols.

Reporter

Tracing data is reported to another system using an OpenTracing reporter. This plugin records tracing data for a given request, and sends it as a batch to a Zipkin server using the Zipkin v2 API. Zipkin version 1.31 or later is required.
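For illustration, a batch sent to the Zipkin v2 API is a JSON array of span objects. The following sketch shows roughly what one reported span might look like; all IDs, timings, and tag values here are made up:

[
  {
    "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
    "id": "00f067aa0ba902b7",
    "kind": "SERVER",
    "name": "get",
    "timestamp": 1700000000000000,
    "duration": 207000,
    "localEndpoint": { "serviceName": "kong" },
    "tags": {
      "lc": "kong",
      "http.method": "GET",
      "http.path": "/some/path"
    }
  }
]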

The config.http_endpoint configuration variable must contain the full URI including scheme, host, port, and path sections (for example, your URI likely ends in /api/v2/spans).
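For example, the following Admin API call enables the plugin globally and points it at a Zipkin collector. This is a minimal sketch: the Admin API address and the collector host are placeholders.

curl -X POST http://localhost:8001/plugins \
  --data "name=zipkin" \
  --data "config.http_endpoint=http://zipkin.example.com:9411/api/v2/spans"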

Spans

The plugin does request sampling. For each request that triggers the plugin, a random number between 0 and 1 is chosen.

If the number is smaller than the configured config.sample_ratio, then a trace with several spans will be generated. If config.sample_ratio is set to 1, then all requests will generate a trace (this might be very noisy).
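For example, to trace roughly one request in four, you could set the sampling ratio on an existing plugin instance. This is a sketch; the plugin ID in the URL is a placeholder:

curl -X PATCH http://localhost:8001/plugins/{zipkin-plugin-id} \
  --data "config.sample_ratio=0.25"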

For each request that gets traced, the following spans are produced: request, proxy, and balancer.

Request span

Span kind: SERVER

There is one request span per request, which encompasses the whole request in Kong Gateway.

The proxy and balancer spans are children of this span. It contains the following logs/annotations for the rewrite phase:

  • krs: kong.rewrite.start
  • krf: kong.rewrite.finish

The request span has the following tags:

  • lc: Hardcoded to kong.
  • kong.service: The UUID of the Gateway Service matched when processing the request, if any.
  • kong.service_name: The name of the Gateway Service matched when processing the request, if the Gateway Service exists and has a name attribute.
  • kong.route: The UUID of the Route matched when processing the request, if any (it can be nil on non-matched requests).
  • kong.route_name: The name of the Route matched when processing the request, if the Route exists and has a name attribute.
  • http.method: The HTTP method used on the original request (only for HTTP requests).
  • http.path: The path of the request (only for HTTP requests).

Additional tags:

  • If the plugin’s tags_header config option is set and the request contains a header with that name carrying correctly encoded tags, those tags are included in the trace (see the example after this list).
  • If the plugin’s static_tags config option is set, the tags in that option are included in the trace.
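For example, both options could be set together as follows. This is a sketch: the hosts and tag values are placeholders, and Zipkin-Tags is the plugin’s default tags_header name.

curl -X POST http://localhost:8001/plugins \
  --header "Content-Type: application/json" \
  --data '{
    "name": "zipkin",
    "config": {
      "http_endpoint": "http://zipkin.example.com:9411/api/v2/spans",
      "static_tags": [{ "name": "environment", "value": "staging" }],
      "tags_header": "Zipkin-Tags"
    }
  }'

A client could then attach per-request tags through that header, with a value along the lines of: Zipkin-Tags: fg=blue; bg=red.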

Proxy span

Span kind: CLIENT

There is one proxy span per request, encompassing most of Kong Gateway’s internal processing of a request.

The proxy span contains the following logs/annotations for the start and finish of each Kong Gateway plugin phase:

  • kas: kong.access.start
  • kaf: kong.access.finish
  • kbs: kong.body_filter.start
  • kbf: kong.body_filter.finish
  • khs: kong.header_filter.start
  • khf: kong.header_filter.finish
  • kps: kong.preread.start (only for stream requests)
  • kpf: kong.preread.finish (only for stream requests)

Balancer span

Span kind: CLIENT

There are zero or more balancer spans per request, each encompassing one balancer attempt. This span contains the following tags specific to load balancing:

  • kong.balancer.try: A number indicating the attempt (one for the first load-balancing attempt, two for the second, and so on).
  • peer.ipv4 or peer.ipv6: The balancer IP.
  • peer.port: The balancer port.
  • error: Set to true if the balancing attempt was unsuccessful; otherwise unset.
  • http.status_code: The HTTP status code received, in case of error.
  • kong.balancer.state: An NGINX-specific description of the error: next/failed for HTTP failures, or 0 for stream failures. Equivalent to state_name in OpenResty’s balancer’s get_last_failure function.

Propagation

The Zipkin plugin supports propagation of tracing headers in multiple formats and offers extensive options for controlling that propagation: you can customize which headers are used to extract and inject the tracing context, and you can configure headers to be cleared after the tracing context has been extracted. The following diagram illustrates the flow:

flowchart LR
   id1(Original Request) -->|"headers (original)"| Extract
   subgraph ide1 [Headers Propagation]
   Extract -->|"headers (original)"| Clear
   Clear -->|"headers (filtered)"| Inject
   end
   Extract -.->|extracted ctx| id2((tracing logic))
   id2((tracing logic)) -.->|updated ctx| Inject
   Inject -->|"headers (updated ctx)"| id3(Updated request)

See the plugin’s configuration reference for a complete overview of the available options and values.

Note: If any of the config.propagation.* configuration options (extract, clear, or inject) are configured, the config.propagation configuration takes precedence over the deprecated config.header_type and config.default_header_type parameters. If none of the config.propagation.* configuration options are set, the config.header_type and config.default_header_type parameters are still used to determine the propagation behavior.
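For example, the following sketch extracts the context from B3 or W3C headers, clears a custom header after extraction, and injects the context in the same format it arrived in. The header name is illustrative, and the example assumes the propagation option values available in recent Kong Gateway versions:

curl -X POST http://localhost:8001/plugins \
  --header "Content-Type: application/json" \
  --data '{
    "name": "zipkin",
    "config": {
      "http_endpoint": "http://zipkin.example.com:9411/api/v2/spans",
      "propagation": {
        "extract": ["b3", "w3c"],
        "clear": ["x-custom-trace-header"],
        "inject": ["preserve"],
        "default_format": "b3"
      }
    }
  }'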

In Kong Gateway 3.6 or earlier, the plugin detects the propagation format from the incoming headers and uses that format to propagate the span context. If no suitable format is found, the plugin falls back to the default format, which is b3.
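On those versions, the behavior could be steered with the deprecated parameters, as in this sketch (the plugin ID is a placeholder):

curl -X PATCH http://localhost:8001/plugins/{zipkin-plugin-id} \
  --data "config.header_type=preserve" \
  --data "config.default_header_type=b3"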

Trace IDs in serialized logs (v3.5+)

When the Zipkin plugin is configured along with a plugin that uses the Log Serializer, the trace ID of each request is added to the key trace_id in the serialized log output.

The value of this field is an object that can contain different formats of the current request’s trace ID. In case of multiple tracing headers in the same request, the trace_id field includes one trace ID format for each different header format, as in the following example:

"trace_id": {
  "b3": "4bf92f3577b34da6a3ce929d0e0e4736",
  "datadog": "11803532876627986230"
},
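For example, the HTTP Log plugin uses the Log Serializer, so enabling it alongside Zipkin makes the trace_id field appear in each serialized log entry. This is a sketch; the log collector endpoint is a placeholder:

curl -X POST http://localhost:8001/plugins \
  --data "name=http-log" \
  --data "config.http_endpoint=http://log-collector.example.com:9999/logs"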

Queuing

The Zipkin plugin uses internal queues to decouple the production of tracing data from its transmission to the Zipkin server.

With queuing, request information is put in a configurable queue before being sent in batches to the upstream server. This has the following benefits:

  • Reduces any possible concurrency on the upstream server
  • Helps deal with temporary outages of the upstream server due to network or administrative changes
  • Can reduce resource usage both in Kong Gateway and on the upstream server by collecting multiple entries from the queue in one request

Note: Because queues are structural elements for components in Kong Gateway, they only live in the main memory of each worker process and are not shared between workers. Therefore, queued content isn’t preserved under abnormal operational situations, like power loss or unexpected worker process shutdown due to memory shortage or program errors.

You can configure several parameters for queuing:

  • Queue capacity limits (config.queue.max_entries, config.queue.max_bytes, config.queue.max_batch_size): Configure the maximum number of queued entries, the maximum batch size, and the maximum queue size in bytes. When a queue has reached its maximum number of entries and another entry is enqueued, the oldest entry in the queue is deleted to make space for the new one. The queue code writes warning log entries when it reaches a capacity threshold of 80% and when it starts to delete entries, and it logs again when the situation normalizes.
  • Timer usage (config.queue.concurrency_limit): One timer is used to start queue processing in the background; this parameter controls how many delivery timers may run concurrently. Once the queue is empty, the timer handler terminates, and a new timer is created as soon as a new entry is pushed onto the queue.
  • Retry logic (config.queue.initial_retry_delay, config.queue.max_coalescing_delay, config.queue.max_retry_delay, config.queue.max_retry_time): If a batch fails to process, the queue library automatically retries when the failure is temporary (for example, network problems or upstream unavailability). Before retrying, the library waits for the amount of time specified by initial_retry_delay. This wait time is doubled on every failed retry, up to the maximum specified by max_retry_delay, and the queue gives up on a batch once it has been retrying for longer than max_retry_time. The max_coalescing_delay parameter bounds how long the queue waits for additional entries before sending a batch.
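For example, queue behavior could be tuned when enabling the plugin, as in the following sketch (the values shown are illustrative, not recommendations):

curl -X POST http://localhost:8001/plugins \
  --data "name=zipkin" \
  --data "config.http_endpoint=http://zipkin.example.com:9411/api/v2/spans" \
  --data "config.queue.max_batch_size=200" \
  --data "config.queue.max_coalescing_delay=10" \
  --data "config.queue.max_retry_time=60"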

When a Kong Gateway shutdown is initiated, the queue is flushed immediately. This allows Kong Gateway to shut down without waiting for new entries to be batched, while still delivering pending data as long as the upstream server can be contacted.

Queues are not shared between workers, and queuing parameters are scoped to one worker. For whole-system capacity planning, consider the number of workers when setting queue parameters: for example, with four worker processes and max_entries set to 10000, a node can hold up to 40000 queued entries in total.
