Hi,

is it possible that watermarks are sometimes not propagated to the WebUI, although internally they advance as normal? In the WebUI every operator shows "No Watermark", but outputs still reach the sink, and the pipeline contains watermark-sensitive operations (e.g. reductions on fixed windows without early emitting).

More strangely, this happens only when I increase parallelism above some threshold. With a parallelism of N, watermarks are shown; when I increase it above some number (which does not seem to be exactly deterministic), the watermarks seem to disappear.

I'm using Flink 1.8.1.

Has anyone experienced something like this before?

Jan

I have also noticed this issue (Flink 1.5, Flink 1.8), and it appears with higher parallelism.

This can be confusing to the user, since the watermarks actually work and can be observed using the metrics.

On Wed, Aug 14, 2019 at 7:36 AM Jan Lukavský <[hidden email]> wrote:
> is it possible, that watermarks are sometimes not propagated to WebUI,
> although they are internally moving as normal?

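To illustrate "observed using the metrics": a minimal sketch of how a metrics response for one operator could be reduced to a single watermark. It assumes the payload is a JSON list of `{"id": "<subtask>.<metric>", "value": "<number>"}` entries, that the metric is named `currentInputWatermark`, and that a subtask without a watermark reports the smallest 64-bit value. None of these details are stated in this thread, and `min_watermark` is a hypothetical helper, not part of Flink.

```python
import json

# Assumption: a subtask that has not seen a watermark yet reports Long.MIN_VALUE.
NO_WATERMARK = -(2 ** 63)


def min_watermark(payload, metric="currentInputWatermark"):
    """Return the lowest watermark across all subtasks, or None if there is none.

    `payload` is assumed to look like
    [{"id": "0.currentInputWatermark", "value": "1565793360000"}, ...].
    """
    values = []
    for entry in json.loads(payload):
        # Split "<subtask>.<metric>" at the first dot.
        subtask, _, name = entry["id"].partition(".")
        if name != metric or not subtask.isdigit():
            continue  # skip unrelated metrics
        values.append(int(entry["value"]))
    if not values:
        return None
    lowest = min(values)
    # The operator as a whole has no watermark while any subtask has none.
    return None if lowest == NO_WATERMARK else lowest
```

Under these assumptions, a result of `None` corresponds to what the UI would show as "No Watermark", while any other value shows that the watermark is in fact advancing.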
Hi,

Thomas, thanks for confirming this. I have noticed that the WebUI has been reworked a lot in 1.9. Does anyone know whether this is still an issue there? I currently cannot easily try 1.9, so I cannot confirm or disprove that.

Jan

On 8/14/19 6:25 PM, Thomas Weise wrote:
> I have also noticed this issue (Flink 1.5, Flink 1.8), and it appears with
> higher parallelism.

I remember an issue where the watermark fetch request from the WebUI exceeded some HTTP size limit, since it tries to fetch all watermarks at once and the format of this request isn't exactly efficient.

Querying metrics for individual operators still works, since those requests are small enough.

Not sure whether we ever fixed that.

On 15/08/2019 12:01, Jan Lukavský wrote:
> I have noticed, that in 1.9 the WebUI has been reworked a lot, does
> anyone know if this is still an issue?

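If the cause is indeed an oversized request, one possible workaround when querying by hand is to split the metric ids into several smaller requests. The following is only a sketch under assumptions: the URL layout `/jobs/<job>/vertices/<vertex>/metrics?get=...`, the metric name `currentInputWatermark`, and the `max_chars` budget are illustrative and not taken from this thread, and both functions are hypothetical helpers.

```python
from typing import Iterable, Iterator, List


def batch_metric_ids(subtasks: Iterable[int], metric: str,
                     max_chars: int) -> Iterator[List[str]]:
    """Yield lists of "<subtask>.<metric>" ids whose comma-joined length
    stays within max_chars (a single oversized id still gets its own batch)."""
    batch: List[str] = []
    length = 0
    for index in subtasks:
        metric_id = f"{index}.{metric}"
        extra = len(metric_id) + (1 if batch else 0)  # +1 for the comma
        if batch and length + extra > max_chars:
            yield batch
            batch, length = [], 0
            extra = len(metric_id)
        batch.append(metric_id)
        length += extra
    if batch:
        yield batch


def metric_urls(base: str, job_id: str, vertex_id: str, parallelism: int,
                metric: str = "currentInputWatermark",
                max_chars: int = 2000) -> List[str]:
    """Build one URL per batch; the path layout is an assumption."""
    prefix = f"{base}/jobs/{job_id}/vertices/{vertex_id}/metrics?get="
    return [prefix + ",".join(ids)
            for ids in batch_metric_ids(range(parallelism), metric, max_chars)]
```

For example, `metric_urls("http://localhost:8081", job_id, vertex_id, 200)` yields several URLs instead of one long one. The right value for `max_chars` depends on the actual limit of the server, which is not stated here.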
Jan, will you be able to test this issue on the now-released Flink 1.9 with the new UI?

What parallelism is needed to reproduce the issue?

On Thu, Aug 15, 2019 at 1:59 PM Chesnay Schepler <[hidden email]> wrote:
> I remember an issue regarding the watermark fetch request from the WebUI
> exceeding some HTTP size limit, since it tries to fetch all watermarks
> at once, and the format of this request isn't exactly efficient.

Hi Robert,

I'd very much love to, but because I run my pipeline with Beam, I'm afraid I will have to wait a little longer, until Beam has a runner for 1.9 [1].

I'm pretty sure that the watermarks disappeared with an overall parallelism (summed over all operators) of somewhere above 2000. There were quite a lot of operators (shuffling), so the individual parallelism of each operator was about 200. The pipeline was spread over 50 taskmanagers (each having 4 slots).

Jan

[1] https://github.com/apache/beam/pull/9296/

On 8/26/19 10:23 AM, Robert Metzger wrote:
> What parallelism is needed to reproduce the issue?

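As a rough back-of-the-envelope check of these numbers, the sketch below estimates how long a `get=` parameter would be if every subtask's watermark were requested as `<subtask>.currentInputWatermark` in one comma-separated list. The metric name and the id format are assumptions, and whether the UI sends one request per operator or one for the whole job is not stated in this thread.

```python
def query_length(parallelism, metric="currentInputWatermark"):
    """Length in characters of "0.<metric>,1.<metric>,..." for all subtasks."""
    ids = [f"{i}.{metric}" for i in range(parallelism)]
    return len(",".join(ids))


# Figures from the message above.
taskmanagers = 50
slots_per_taskmanager = 4
total_slots = taskmanagers * slots_per_taskmanager     # 200 slots

per_operator_parallelism = 200
overall_parallelism = 2000
operators = overall_parallelism // per_operator_parallelism  # about 10 operators

# Characters needed to name every subtask of one operator in one request.
chars_per_operator = query_length(per_operator_parallelism)

# Characters needed if all operators were covered at once (ids only,
# ignoring whatever would be needed to tell the operators apart).
chars_for_whole_job = operators * chars_per_operator

print(total_slots, operators, chars_per_operator, chars_for_whole_job)
```

With these assumptions a single operator already needs about 5 KB of metric ids and the whole job about ten times that, so a request-size limit in the low kilobytes would be consistent with watermarks vanishing only at higher parallelism. This is an estimate, not a confirmed diagnosis.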
The issue persists with 1.9.1: https://issues.apache.org/jira/browse/FLINK-14470

On Mon, Aug 26, 2019 at 1:47 AM Jan Lukavský <[hidden email]> wrote:
> I'm pretty sure that the watermarks disappeared with overall
> parallelism (over all operators) something above 2000.