Hi Dom,
If you want the job to tolerate checkpoint failures, you can use the
setTolerableCheckpointFailureNumber API [1].
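For example, a minimal sketch (assuming checkpointing is enabled on the
StreamExecutionEnvironment; the value 3 is only illustrative):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // Take a checkpoint every 60 seconds.
    env.enableCheckpointing(60_000);

    // Tolerate up to 3 checkpoint failures before the job itself is failed.
    env.getCheckpointConfig().setTolerableCheckpointFailureNumber(3);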
However, for jobs with checkpointing enabled, if a stateful operator
actually fails, Flink cannot ignore that failure without restarting the job.
The region-based failover (regional recovery) strategy may then be
appropriate for your scenario, since it restricts the restart to the failed
region rather than the whole job.
At this stage, if you do not want the job to restart because of
non-critical operators, you may need to customize the relevant
implementations so that their exceptions are not thrown to the Flink
framework.
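For the Elasticsearch sink specifically, one possible direction is a custom
ActionRequestFailureHandler that logs and drops failed requests instead of
rethrowing them, so the error never reaches the framework. This is only a
rough sketch, assuming the flink-connector-elasticsearch
ElasticsearchSink.Builder is used; the class name LogAndDropFailureHandler
is made up for illustration:

    import org.apache.flink.streaming.connectors.elasticsearch.ActionRequestFailureHandler;
    import org.apache.flink.streaming.connectors.elasticsearch.RequestIndexer;
    import org.elasticsearch.action.ActionRequest;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Hypothetical handler: swallows Elasticsearch errors so they never
    // reach the Flink framework and therefore never trigger a job restart.
    public class LogAndDropFailureHandler implements ActionRequestFailureHandler {

        private static final Logger LOG =
                LoggerFactory.getLogger(LogAndDropFailureHandler.class);

        @Override
        public void onFailure(
                ActionRequest action,
                Throwable failure,
                int restStatusCode,
                RequestIndexer indexer) {
            // Rethrowing here would fail the sink and restart the job;
            // logging and dropping keeps the pipeline running at the cost
            // of losing this record in Elasticsearch.
            LOG.warn("Dropping failed Elasticsearch request (status {}): {}",
                    restStatusCode, failure.getMessage());
        }
    }

It could then be plugged in via
esSinkBuilder.setFailureHandler(new LogAndDropFailureHandler()). Note that
this trades durability for availability: records that fail to be indexed
are simply lost.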
Best,
Vino
[1]:
https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/CheckpointConfig.java#L319

Dominik Wosiński <[hidden email]> wrote on Mon, Oct 14, 2019 at 8:16 PM:
> Hey,
> I have a question that I have not been able to find an answer to in the
> docs or in any other source. Suppose we have a business system and we are
> using the Elasticsearch sink, not for the business logic itself, but
> rather for keeping information about the data that is flowing through the
> system. The Elasticsearch part is not crucial for the application, so I
> would like to keep the application running even if Elasticsearch itself is
> failing (for example because the external system is down). Is there a way
> to exclude a task from checkpointing and ignore its failure, so that the
> job is not restarted if only one of the sinks is down?
>
> Thanks in advance,
> Best Regards,
> Dom.
>