I have the following scenario: I have a Quarkus Kafka consumer that gets events (in JSON) from a topic and sends them on to a downstream system via its own web service.
However, sometimes there is a network hiccup or the target system is just down (maintenance, etc.).
I currently have the consumer set up to retry indefinitely, but I notice that it retries a few times, then marks the consumer as BAD in the health endpoint and stops. This is workable, as I can restart the consumer manually, but I would rather the consumer did not rely on human intervention.
My current code looks as follows:
    @Incoming("fin-in")
    @Retry(delay = 120, delayUnit = ChronoUnit.SECONDS, maxRetries = -1,
           maxDuration = 300, durationUnit = ChronoUnit.SECONDS)
    public void receive(ConsumerRecord<String, String> event) throws Exception, RetryException {
        try {
            // Send to Client endpoint
            CloseableHttpClient target = HttpClients.createDefault();
            HttpPost httpPost = new HttpPost(endpoint);
            UsernamePasswordCredentials creds = new UsernamePasswordCredentials(userName, passwd);
            httpPost.addHeader(new BasicScheme().authenticate(creds, httpPost, null));
            StringEntity entity = new StringEntity(event.value());
            httpPost.setEntity(entity);
            httpPost.setHeader("Accept", "application/json");
            httpPost.setHeader("Content-type", "application/json");
            CloseableHttpResponse response = target.execute(httpPost);

            if (response.getStatusLine().getStatusCode() == okStatus) {
                LG.info("Event sent successfully");
            } else {
                if (response.getStatusLine().getStatusCode() == failStatus) {
                    // Never able to process event, log error and continue to next event
                    HttpEntity hEntity = response.getEntity();
                    String entityJSON = EntityUtils.toString(hEntity);
                    LG.error("Fatal error sending to Client endpoint: " + failStatus + " "
                            + response.getStatusLine().getReasonPhrase() + " " + entityJSON);
                    logStdError("Error Sending record to Client endpoint"
                            + response.getStatusLine().getReasonPhrase(),
                            event.value(), "ClientEvent", topic, "Client", event.key());
                } else {
                    // Any other status: log it and throw so that @Retry redelivers the same event
                    HttpEntity hEntity = response.getEntity();
                    String entityJSON = EntityUtils.toString(hEntity);
                    LG.error("Unable to send to Client target, retrying: "
                            + response.getStatusLine().getStatusCode() + " "
                            + response.getStatusLine().getReasonPhrase() + " " + entityJSON);
                    logStdError("Error Sending record to Client endpoint, http status = "
                            + response.getStatusLine().getStatusCode() + " retrying later "
                            + response.getStatusLine().getReasonPhrase(),
                            event.value(), "ClientEvent", topic, "Client", event.key());
                    throw new RetryException("RETRY_EXCEPTION",
                            "Unable to send to Client target, retrying "
                            + response.getStatusLine().getStatusCode() + " "
                            + response.getStatusLine().getReasonPhrase());
                }
            }
        } catch (RetryException rEx) {
            // Already logged above, rethrow unchanged
            throw rEx;
        } catch (Exception ex) {
            logStdError("Error sending event to Client endpoint: " + ex.getMessage(),
                    event.value(), "ClientEvent", topic, "Client", event.key());
            throw ex;
        }
    }
What I would like to implement is an indefinite retry with exponential backoff. That is, after the first failure, retry the same event in 5 minutes. If it fails again, wait 10 minutes, then 20 minutes, and so on, doubling the time between retries each time.
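To make the schedule I have in mind concrete, here is a minimal sketch in plain Java of how the delays would be calculated. It is not wired into the consumer and does not use any Quarkus or fault-tolerance API. The class name `BackoffSchedule`, the method `delayForAttempt`, and the 24-hour ceiling are all made up for illustration; the ceiling in particular is an assumption I added so the doubling cannot grow without bound.

```java
import java.time.Duration;

final class BackoffSchedule {
    // From the description above: the first retry waits 5 minutes.
    static final Duration INITIAL_DELAY = Duration.ofMinutes(5);
    // Assumption (hypothetical value): stop doubling once the delay reaches 24 hours.
    static final Duration MAX_DELAY = Duration.ofHours(24);

    /** Returns the wait before retry number {@code attempt} (1-based). */
    static Duration delayForAttempt(int attempt) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        // Clamp the exponent so the left shift cannot overflow a long.
        int exponent = Math.min(attempt - 1, 20);
        // Doubling: 5 min, 10 min, 20 min, ...
        Duration delay = INITIAL_DELAY.multipliedBy(1L << exponent);
        return delay.compareTo(MAX_DELAY) > 0 ? MAX_DELAY : delay;
    }

    public static void main(String[] args) {
        // Print the first few delays of the schedule.
        for (int attempt = 1; attempt <= 6; attempt++) {
            System.out.println("retry " + attempt + " after "
                    + delayForAttempt(attempt).toMinutes() + " min");
        }
    }
}
```

With these values, the first three retries wait 5, 10, and 20 minutes, and from the tenth retry onward the wait stays at the assumed 24-hour ceiling.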