Expose per partition Kafka lag metric in Flink Kafka connector

Shuyi Chen
Hi all,

We found that Flink's Kafka connector does not expose the per-partition
Kafka lag. The metric has been available in KafkaConsumer since Kafka 0.10.2,
and it is important for diagnosing which Kafka partition is lagging in
production. I've created a JIRA (
https://issues.apache.org/jira/browse/FLINK-11912) and proposed a
change in Flink to register and expose the metric. Could someone take
a look and give some suggestions? Thanks a lot.

Shuyi
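
For readers unfamiliar with the metric: per-partition lag is the distance between a partition's log end offset and the consumer's current position in that partition. The following is only a minimal sketch of that arithmetic, not the change proposed in FLINK-11912. It uses plain maps instead of the Kafka client or Flink's metric API, and the class name, method names, and partition labels such as "orders-0" are hypothetical. In a real connector the values would presumably be read from the consumer's own metrics rather than recomputed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionLagSketch {

    /**
     * Computes lag per partition as (log end offset - consumer position),
     * floored at zero. Partitions without a known position are skipped,
     * on the assumption that they are not assigned to this consumer.
     */
    public static Map<String, Long> computeLag(Map<String, Long> endOffsets,
                                               Map<String, Long> positions) {
        Map<String, Long> lag = new LinkedHashMap<>();
        for (Map.Entry<String, Long> entry : endOffsets.entrySet()) {
            Long position = positions.get(entry.getKey());
            if (position == null) {
                continue; // no position known for this partition
            }
            lag.put(entry.getKey(), Math.max(0L, entry.getValue() - position));
        }
        return lag;
    }

    /** Returns the partition with the largest lag, or null if the map is empty. */
    public static String mostLagging(Map<String, Long> lag) {
        String worst = null;
        long worstLag = -1L;
        for (Map.Entry<String, Long> entry : lag.entrySet()) {
            if (entry.getValue() > worstLag) {
                worst = entry.getKey();
                worstLag = entry.getValue();
            }
        }
        return worst;
    }

    public static void main(String[] args) {
        // Hypothetical offsets, for illustration only.
        Map<String, Long> endOffsets = new LinkedHashMap<>();
        endOffsets.put("orders-0", 120L);
        endOffsets.put("orders-1", 300L);

        Map<String, Long> positions = new LinkedHashMap<>();
        positions.put("orders-0", 100L);
        positions.put("orders-1", 300L);

        Map<String, Long> lag = computeLag(endOffsets, positions);
        System.out.println("lag per partition: " + lag);
        System.out.println("most lagging partition: " + mostLagging(lag));
    }
}
```

An aggregate lag number hides which partition is behind; keeping one value per partition is what makes it possible to spot a single slow partition.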
Re: Expose per partition Kafka lag metric in Flink Kafka connector

Becket Qin
Hi Shuyi,

Thanks for bringing this issue up. Per-partition lag is definitely
something that should be exposed. I replied to the JIRA with some of my
concerns. Do you mind keeping the discussion in the JIRA ticket so it is
easier for future readers to follow the issue?

Thanks,

Jiangjie (Becket) Qin

On Mon, Mar 18, 2019 at 2:14 PM Shuyi Chen <[hidden email]> wrote:

> Hi all,
>
> We found that Flink's Kafka connector does not expose the per-partition
> Kafka lag. The metric has been available in KafkaConsumer since Kafka 0.10.2,
> and it is important for diagnosing which Kafka partition is lagging in
> production. I've created a JIRA (
> https://issues.apache.org/jira/browse/FLINK-11912) and proposed a
> change in Flink to register and expose the metric. Could someone take
> a look and give some suggestions? Thanks a lot.
>
> Shuyi
>
Re: Expose per partition Kafka lag metric in Flink Kafka connector

Shuyi Chen
Thanks a lot, Becket. I am sorry that I was out of the loop for the last
few days due to sickness. Let’s continue the discussion on the JIRA.

Shuyi

On Mon, Mar 25, 2019 at 4:46 AM Becket Qin <[hidden email]> wrote:

> Hi Shuyi,
>
> Thanks for bringing this issue up. Per-partition lag is definitely
> something that should be exposed. I replied to the JIRA with some of my
> concerns. Do you mind keeping the discussion in the JIRA ticket so it is
> easier for future readers to follow the issue?
>
> Thanks,
>
> Jiangjie (Becket) Qin
Re: Expose per partition Kafka lag metric in Flink Kafka connector

Shuyi Chen
Ping to bump this topic. Could we get some more input, here or on the JIRA,
so that we can agree on how to resolve this issue? Thanks a lot.

Shuyi

On Mon, Mar 25, 2019 at 8:43 AM Shuyi Chen <[hidden email]> wrote:

> Thanks a lot, Becket. I am sorry that I was out of the loop for the last
> few days due to sickness. Let’s continue the discussion on the JIRA.
>
> Shuyi