Watermarks not propagated to WebUI?

Jan Lukavský
Hi,

is it possible that watermarks are sometimes not propagated to the WebUI,
although they are advancing normally internally? In the WebUI every
operator shows "No Watermark", but output still reaches the sink (and
watermark-sensitive operations are involved, e.g. reductions on fixed
windows without early emitting). More strangely, this happens only when
I increase parallelism above some threshold. With a parallelism of N,
watermarks are shown; when I increase it above some number (the
threshold does not seem to be exactly deterministic), the watermarks
seem to disappear.

I'm using Flink 1.8.1.

Has anyone experienced something like this before?

Jan

Re: Watermarks not propagated to WebUI?

Thomas Weise
I have also noticed this issue (Flink 1.5, Flink 1.8), and it appears with
higher parallelism.

This can be confusing to the user when watermarks actually work and can be
observed using the metrics.

On Wed, Aug 14, 2019 at 7:36 AM Jan Lukavský <[hidden email]> wrote:

> Hi,
>
> is it possible that watermarks are sometimes not propagated to the WebUI,
> although they are advancing normally internally? In the WebUI every
> operator shows "No Watermark", but output still reaches the sink (and
> watermark-sensitive operations are involved, e.g. reductions on fixed
> windows without early emitting). More strangely, this happens only when
> I increase parallelism above some threshold. With a parallelism of N,
> watermarks are shown; when I increase it above some number (the
> threshold does not seem to be exactly deterministic), the watermarks
> seem to disappear.
>
> I'm using Flink 1.8.1.
>
> Has anyone experienced something like this before?
>
> Jan

Re: Watermarks not propagated to WebUI?

Jan Lukavský
Hi,

Thomas, thanks for confirming this. I have noticed that the WebUI was
reworked substantially in 1.9; does anyone know whether this is still an
issue there? I cannot easily try 1.9 at the moment, so I cannot confirm
or rule it out.

Jan

On 8/14/19 6:25 PM, Thomas Weise wrote:

> I have also noticed this issue (Flink 1.5, Flink 1.8), and it appears with
> higher parallelism.
>
> This can be confusing to the user when watermarks actually work and can be
> observed using the metrics.

Re: Watermarks not propagated to WebUI?

Chesnay Schepler-3
I remember an issue regarding the watermark fetch request from the WebUI
exceeding some HTTP size limit, since it tries to fetch all watermarks
at once, and the format of this request isn't exactly efficient.

Querying metrics for individual operators still works since the request
is small enough.

Not sure whether we ever fixed that.
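
The workaround implied above (many small requests instead of one large one) can be sketched roughly as follows. This is only an illustration, not how the WebUI is implemented. It assumes the per-vertex metrics endpoint `/jobs/<job>/vertices/<vertex>/metrics?get=...` and subtask-prefixed metric names such as `0.currentInputWatermark`; the metric name may differ between Flink versions, and `max_url_len` is an arbitrary value chosen for the example.

```python
from typing import Iterator, List

# Assumption: per-subtask watermark metric name; may differ between Flink versions.
METRIC = "currentInputWatermark"


def batched_metric_queries(base_url: str, job_id: str, vertex_id: str,
                           parallelism: int,
                           max_url_len: int = 2000) -> Iterator[str]:
    """Yield metrics URLs for one vertex, splitting the subtask list so that
    each URL stays within max_url_len (as long as a single name fits)."""
    # Assumed endpoint layout of the REST API's vertex metrics query.
    prefix = f"{base_url}/jobs/{job_id}/vertices/{vertex_id}/metrics?get="
    batch: List[str] = []
    for subtask in range(parallelism):
        name = f"{subtask}.{METRIC}"
        candidate = prefix + ",".join(batch + [name])
        if batch and len(candidate) > max_url_len:
            # Adding this name would exceed the limit: emit the current batch.
            yield prefix + ",".join(batch)
            batch = [name]
        else:
            batch.append(name)
    if batch:
        yield prefix + ",".join(batch)


if __name__ == "__main__":
    # Hypothetical host and IDs, for illustration only.
    for url in batched_metric_queries("http://localhost:8081", "JOB_ID",
                                      "VERTEX_ID", parallelism=200):
        print(len(url), url[:80] + "...")
```

Each yielded URL can then be fetched separately and the results merged on the client side, which keeps every individual request small regardless of parallelism.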

On 15/08/2019 12:01, Jan Lukavský wrote:

> Hi,
>
> Thomas, thanks for confirming this. I have noticed that the WebUI was
> reworked substantially in 1.9; does anyone know whether this is still an
> issue there? I cannot easily try 1.9 at the moment, so I cannot confirm
> or rule it out.
>
> Jan

Re: Watermarks not propagated to WebUI?

Robert Metzger
Jan, will you be able to test this issue on the now-released Flink 1.9 with
the new UI?

What parallelism is needed to reproduce the issue?


On Thu, Aug 15, 2019 at 1:59 PM Chesnay Schepler <[hidden email]> wrote:

> I remember an issue regarding the watermark fetch request from the WebUI
> exceeding some HTTP size limit, since it tries to fetch all watermarks
> at once, and the format of this request isn't exactly efficient.
>
> Querying metrics for individual operators still works since the request
> is small enough.
>
> Not sure whether we ever fixed that.

Re: Watermarks not propagated to WebUI?

Jan Lukavský
Hi Robert,

I'd very much love to, but because I run my pipeline with Beam, I'm
afraid I will have to wait a little longer, until Beam has a runner for
1.9 [1]. I'm pretty sure that the watermarks disappeared once the overall
parallelism (summed over all operators) was somewhere above 2000. There
were quite a lot of operators (shuffles), so the parallelism of each
individual operator was about 200. The pipeline was spread over 50
taskmanagers (each having 4 slots).

Jan

[1] https://github.com/apache/beam/pull/9296/
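
To get a feel for why a parallelism of about 200 per operator could matter, here is a back-of-the-envelope sketch of how long a "fetch all subtask watermarks" query string becomes. It assumes the names have the form `<subtask>.currentInputWatermark` joined by commas; both the metric name and the 4096-character limit below are assumptions for illustration, since the thread does not say which limit is hit.

```python
# Assumption: per-subtask watermark metric name; may differ between Flink versions.
METRIC = "currentInputWatermark"

# Hypothetical request-size limit, used only to illustrate the threshold effect.
ASSUMED_LIMIT = 4096


def query_length(parallelism: int, metric: str = METRIC) -> int:
    """Characters needed to list '<subtask>.<metric>' for every subtask,
    separated by commas."""
    names = [f"{i}.{metric}" for i in range(parallelism)]
    return len(",".join(names))


if __name__ == "__main__":
    for p in (50, 100, 200):
        n = query_length(p)
        print(f"parallelism={p}: {n} chars, over assumed limit: {n > ASSUMED_LIMIT}")
```

Under these assumptions the list for 100 subtasks stays well below the assumed limit, while the list for 200 subtasks is roughly 5,000 characters and exceeds it, which would be consistent with watermarks disappearing only above some parallelism.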

On 8/26/19 10:23 AM, Robert Metzger wrote:

> Jan, will you be able to test this issue on the now-released Flink 1.9
> with the new UI?
>
> What parallelism is needed to reproduce the issue?

Re: Watermarks not propagated to WebUI?

Thomas Weise
The issue persists with 1.9.1:

https://issues.apache.org/jira/browse/FLINK-14470


On Mon, Aug 26, 2019 at 1:47 AM Jan Lukavský <[hidden email]> wrote:

> Hi Robert,
>
> I'd very much love to, but because I run my pipeline with Beam, I'm
> afraid I will have to wait a little longer, until Beam has a runner for
> 1.9 [1]. I'm pretty sure that the watermarks disappeared once the overall
> parallelism (summed over all operators) was somewhere above 2000. There
> were quite a lot of operators (shuffles), so the parallelism of each
> individual operator was about 200. The pipeline was spread over 50
> taskmanagers (each having 4 slots).
>
> Jan
>
> [1] https://github.com/apache/beam/pull/9296/