Istio’s default retry behavior

Anshul Prakash
FAUN — Developer Community 🐾
Nov 23, 2022

This article discusses Istio’s default retry behavior and how to check if it is happening.

Using Istio’s traffic rules, you can easily control service-level properties such as circuit breakers, timeouts, and retries to make services more resilient or to set up A/B testing.

But some of Istio’s defaults may not suit your application. One of them is the default retry behavior.

As per Istio’s documentation:

A retry setting specifies the maximum number of times an Envoy proxy attempts to connect to a service if the initial call fails. Retries can enhance service availability and application performance by making sure that calls don’t fail permanently because of transient problems such as a temporarily overloaded service or network. The interval between retries (25ms+) is variable and determined automatically by Istio, preventing the called service from being overwhelmed with requests. The default retry behavior for HTTP requests is to retry twice before returning the error.

So, if you have a service where retrying requests is expensive (for example, database transactions), this default can cause unexpected application behavior and leave you wondering why retries are happening.

I could not find any Istio documentation on how to check whether retries are happening, hence this article.

TL;DR

1. Enable debug logs for istio-proxy:

istioctl pc log --level debug deploy/$DEPLOYMENT_NAME -n $NAMESPACE

2. Check the logs for retry behavior:

kubectl logs deploy/$DEPLOYMENT_NAME -c istio-proxy -n $NAMESPACE | grep "x-envoy-attempt-count"

Prerequisites

  • Set up Istio by following the instructions in the Installation guide.
  • Deploy the Bookinfo sample application.
  • All commands below assume that your kubeconfig context is set to the namespace in which Bookinfo is installed.

Set up Virtual Service and Destination Rules

Apply the following definitions in the namespace where Bookinfo is installed:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: ratings
spec:
  host: ratings
  subsets:
  - name: v1
    labels:
      version: v1
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - route:
    - destination:
        host: ratings
        subset: v1

We are not setting any retry policy in the VirtualService, so Istio’s default retry behavior should kick in.
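For contrast, here is a rough sketch of what the same VirtualService could look like with an explicit retry policy that turns retries off. The file name ratings-no-retries.yaml is my own choice, and attempts: 0 is the setting I would expect to disable retries. Do not apply it for this walkthrough, since the goal here is to observe the default behavior.

```shell
# Write a VirtualService manifest with an explicit retry policy to a local file.
# ratings-no-retries.yaml is a hypothetical file name used only for illustration.
cat > ratings-no-retries.yaml <<'EOF'
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ratings
spec:
  hosts:
  - ratings
  http:
  - route:
    - destination:
        host: ratings
        subset: v1
    # Assumption: an explicit retry policy replaces the default one,
    # and zero attempts means no retries at all.
    retries:
      attempts: 0
EOF
```

Once you are done observing the defaults, applying the file with kubectl apply -f ratings-no-retries.yaml and repeating the test should, under these assumptions, show only a first attempt in the logs.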

Enable debug logs for istio-proxy

istioctl pc log --level debug deploy/ratings-v1

Simulate App failure

Patch the ratings deployment so that its container runs sleep instead of the application, which makes requests to it fail:

kubectl patch deploy ratings-v1 --patch '{"spec":{"template":{"spec":{"containers":[{"name":"ratings","command":["sleep","1h"]}]}}}}'

Send some traffic

Start a curl pod and, from the shell inside it, send a request to the ratings service:

kubectl run -it curl --image=curlimages/curl:7.73.0 --rm -- sh
curl -v http://ratings:9080/ratings/1

Check istio-proxy logs

kubectl logs deploy/ratings-v1 -c istio-proxy | grep "x-envoy-attempt-count"

You should see log lines like these, showing that retries were attempted:

'x-envoy-attempt-count', '1'
'x-envoy-attempt-count', '1'
'x-envoy-attempt-count', '2'
'x-envoy-attempt-count', '2'
'x-envoy-attempt-count', '3'
'x-envoy-attempt-count', '3'
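If you only want a single number out of that output, a small shell helper can reduce the log to the highest attempt count seen. This is a minimal sketch: max_attempt_count is a name I made up, it is not part of kubectl or istioctl, and it assumes the header lines look exactly like the ones above.

```shell
# Print the highest x-envoy-attempt-count value found in proxy debug logs
# read from stdin. max_attempt_count is a hypothetical helper name.
max_attempt_count() {
  grep -o "'x-envoy-attempt-count', '[0-9][0-9]*'" \
    | grep -o "[0-9][0-9]*" \
    | sort -n \
    | tail -n 1
}

# Example usage against the sidecar logs (needs a running cluster):
#   kubectl logs deploy/ratings-v1 -c istio-proxy | max_attempt_count
```

A result of 3 would correspond to the original attempt plus the two default retries described in the documentation quoted earlier.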

For more detail, including the routing decision, print the lines preceding each match:

kubectl logs deploy/ratings-v1 -c istio-proxy | grep -B 10 "x-envoy-attempt-count"
2022-11-23T17:44:25.179776Z debug envoy router [C49][S10562819190590727274] cluster 'inbound|9080||' match for URL '/ratings/1'
2022-11-23T17:44:25.179821Z debug envoy router [C49][S10562819190590727274] router decoding headers:
':authority', 'ratings:9080'
':path', '/ratings/1'
':method', 'GET'
':scheme', 'http'
'user-agent', 'curl/7.73.0-DEV'
'accept', '*/*'
'x-forwarded-proto', 'http'
'x-request-id', '04dc377f-5052-92f8-9aa3-8a52069172df'
'x-envoy-attempt-count', '3'
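Since every header dump carries both x-request-id and x-envoy-attempt-count, the two can be paired up to see how many attempts each request went through. The following is a sketch under two assumptions: the x-request-id line appears before the x-envoy-attempt-count line within a dump, as in the output above, and retries of a request keep the same x-request-id. attempts_per_request is a hypothetical helper name.

```shell
# Read proxy debug logs from stdin and print "<request-id> <highest attempt count>"
# for each request. attempts_per_request is a hypothetical helper name.
attempts_per_request() {
  awk -F"'" '
    # Header lines look like: <quote>name<quote>, <quote>value<quote>
    # so with a quote as the separator, the value is field 4.
    /x-request-id/          { id = $4 }
    /x-envoy-attempt-count/ { if ($4 + 0 > max[id]) max[id] = $4 + 0 }
    END { for (r in max) print r, max[r] }
  ' | sort
}

# Example usage against the sidecar logs (needs a running cluster):
#   kubectl logs deploy/ratings-v1 -c istio-proxy | attempts_per_request
```

Requests that show a count above 1 are the ones that were retried.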
