Hey all,
I am using the KeyedCoProcessFunction class in the Flink DataStream API to implement a timeout-like use case. The scenario is as follows: I have an input Kafka topic and an output Kafka topic. An external service reads from the input topic, processes each element (for a variable amount of time), and then publishes the response to the output Kafka topic.

To implement the timeout (it must use the Flink DataStream API), I have one FlinkKafkaConsumer that reads from the Kafka input topic and another FlinkKafkaConsumer that reads from the Kafka output topic (once the element has been processed and published by the external service). I connect the two streams. In processElement1 I register a timer, and then one of two things happens: either onTimer fires and a timeout is declared, or processElement2 fires first, in which case I delete the timer and no timeout is declared.

In this setup, can an element be read from the output topic (processElement2 fires) before the corresponding element is read from the input topic (processElement1 fires), given that the external service may take seconds to process the element before publishing it to the output topic? Is that how Flink works by design? Is there any way to force Flink connected streams to operate on a first-come, first-served basis?

If that can happen, what is the best way to implement the timeout functionality described above strictly using the Flink DataStream API? Any hints, please?

Thank you so much.

--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
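For readers following the thread, the per-key logic described above can be modelled without any Flink dependency. The sketch below is only an illustration under stated assumptions: the class `TimeoutTracker` and its methods are hypothetical names, not Flink APIs, and it assumes that no ordering between the two connected inputs can be relied on, so it handles a response that is seen before its request. In an actual KeyedCoProcessFunction the two collections would presumably be keyed state (for example ValueState), and the deadline would be a timer registered with ctx.timerService().registerProcessingTimeTimer(...) and cancelled with deleteProcessingTimeTimer(...).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical, Flink-free model of the per-key timeout logic. */
final class TimeoutTracker {
    private final long timeoutMs;
    // key -> deadline; stands in for "timer registered, no response yet"
    private final Map<String, Long> pendingDeadline = new TreeMap<>();
    // keys whose response was seen before the request
    private final Set<String> earlyResponses = new HashSet<>();

    TimeoutTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /** Record from the input topic (the role of processElement1). */
    void onRequest(String key, long now) {
        if (earlyResponses.remove(key)) {
            return; // response already arrived: nothing to time out
        }
        pendingDeadline.put(key, now + timeoutMs); // "register the timer"
    }

    /** Record from the output topic (the role of processElement2). */
    void onResponse(String key) {
        if (pendingDeadline.remove(key) == null) { // "delete the timer"
            earlyResponses.add(key); // request not seen yet: remember it
        }
    }

    /** Advances the clock and returns the keys that timed out (the role of onTimer). */
    List<String> advanceTo(long now) {
        List<String> timedOut = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = pendingDeadline.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= now) {
                timedOut.add(entry.getKey());
                it.remove();
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        TimeoutTracker tracker = new TimeoutTracker(5_000);
        tracker.onRequest("a", 0);
        tracker.onResponse("a");     // normal order: timer cancelled
        tracker.onResponse("b");     // response seen first
        tracker.onRequest("b", 100); // request seen later: no timer needed
        tracker.onRequest("c", 200); // never answered
        System.out.println(tracker.advanceTo(10_000)); // prints [c]
    }
}
```

The model assumes each key is used for exactly one request/response pair. A response that arrives after its timeout has fired is remembered as an "early" response here, so a real job would likely want to expire or clear that state as well.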
Mazen, questions like this are better suited to the user mailing list.
FYI, this is also being discussed on Stack Overflow: https://stackoverflow.com/questions/63902457/flink-timeout-using-keyedcoprocessfunction-and-order-of-reading-for-flinkkafkaco

Regards,
David

On Wed, Sep 16, 2020 at 9:41 AM Mazen Ezzeddine <[hidden email]> wrote:
> [...]
OK, thanks so much, David. Very helpful.
Sorry for any inconvenience.