[DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Yadong Xie
Hi all,

The Flink Web UI is the main platform most users rely on to monitor their
jobs and clusters. We rebuilt the Web UI in version 1.9.0, but some
shortcomings remain.

This discussion thread aims to provide a better experience for Flink UI
users.

The design doc I drafted is at [1], and the FLIP can be found at [2].

Please keep the discussion here, on the mailing list.

Looking forward to your opinions; any feedback is welcome.

[1] https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-75%3A+Flink+Web+UI+Improvement+Proposal

PaulLam
Hi Yadong,

Thanks a lot for summing up the Web UI efforts.

I have a minor suggestion: can we provide a collapse button for the task names in the job graph visualization? For some complex jobs, especially SQL jobs, the task names are quite long, which makes the job graph hard to read.

Best,
Paul Lam

> On Sep 27, 2019, at 10:13, Yadong Xie <[hidden email]> wrote:


Yadong Xie
Hi Paul,
Thanks for your suggestion.
I think it is easy to implement. Could you create a JIRA issue for me?

On Fri, Sep 27, 2019 at 11:11 AM Paul Lam <[hidden email]> wrote:


PaulLam
Filed a JIRA issue to track this [1]. Thanks a lot.

[1] https://issues.apache.org/jira/browse/FLINK-14242

Best,
Paul Lam

> On Sep 27, 2019, at 14:34, Yadong Xie <[hidden email]> wrote:


Xintong Song
Thanks for drafting the FLIP and starting this discussion, Yadong.


I have some comments:


   - I can see that the proposed memory and CPU usage to be displayed (in
   section 1.1) is aligned with the current ResourceProfile fields. However,
   we are working on changing the memory fields in 1.10 with FLIP-49 [1]. I
   suggest we align the UI design with the new FLIP-49 memory fields.
   - The task executor overview design (in section 1.2) is based on the
   current slot model. The upcoming FLIP-56 [2], which is also planned for
   1.10, changes the model so that task executors no longer have a fixed
   number of slots, but rather allocated slots (which may have different
   resources) and available resources.
      - I can see that there are discussions in the Google doc about using
      a different color for available resources. However, availability can
      differ from one resource field to another, and may not be simply
      displayed by a different color. E.g., a task executor may have two
      slots, where slot 1 takes (20% cpu, 10% heap mem, 50% managed mem,
      etc.), slot 2 takes (10% cpu, 35% heap mem, 0% managed mem, etc.), and
      the remaining resources in the task executor are (70% cpu, 55% heap
      mem, 50% managed mem, etc.). How do you plan to display that?
      - I would suggest having multiple bars for each task executor, where
      each bar represents one of the resource fields. In addition, we may
      have a number (or some other figure) showing how many slots are
      allocated from the task executor.
   - Is there any way we can provide access to the logs of terminated task
   executors? It happens to us a lot that a job fails because a task
   executor failed or was lost, and we then have to find the logs of the
   failed task executors by manually accessing the file system. I think it
   would be helpful if we could find the logs of failed task executors
   directly in the Flink Web UI.
   - Regarding log pagination, is there any way to provide keyword
   searching across all the pages?
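
The two-slot example above can be made concrete with a small sketch. It is
only an illustration in Python: the field names (cpu, heap, managed) and the
helper remaining_resources are hypothetical and not taken from Flink or the
FLIP, and all shares are assumed to be percentages of the task executor's
total.

```python
from typing import Dict, List

# Hypothetical field names; shares are percentages of the task executor total.
FIELDS = ("cpu", "heap", "managed")


def remaining_resources(slots: List[Dict[str, int]]) -> Dict[str, int]:
    """Return, per resource field, the share not taken by any allocated slot."""
    remaining = {}
    for field in FIELDS:
        used = sum(slot.get(field, 0) for slot in slots)
        if used > 100:
            raise ValueError(f"{field} is over-allocated: {used}%")
        remaining[field] = 100 - used
    return remaining


# The two slots from the example in the message.
slots = [
    {"cpu": 20, "heap": 10, "managed": 50},
    {"cpu": 10, "heap": 35, "managed": 0},
]
remaining = remaining_resources(slots)

# One bar per field: (allocated share, remaining share).
bars = {field: (100 - remaining[field], remaining[field]) for field in FIELDS}
```

Because each field ends up with its own allocated/remaining pair, a single
color per task executor cannot carry this information, which is the
motivation for one bar per field.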


Thank you~

Xintong Song


[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-49%3A+Unified+Memory+Configuration+for+TaskExecutors
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-56%3A+Dynamic+Slot+Allocation

On Fri, Sep 27, 2019 at 3:57 PM Paul Lam <[hidden email]> wrote:


Yadong Xie
Hi Xintong Song,

Thanks for your comments!

1. I think it is a good idea to align the CPU and memory usage with
FLIP-49, if it will be released in version 1.10.
2. We can update the task executor UI design after FLIP-56 is merged into
master. Actually, the image
<https://cwiki.apache.org/confluence/download/attachments/125309297/BlinkResourceTM.png?version=1&modificationDate=1566223821000&api=v2>
in FLIP-56 is a good UI design; we can follow it in the Flink Web UI.
3. I have no idea about this; maybe someone familiar with the runtime part
could answer it? In my opinion, it would be great to add it to the Web UI.
4. I'm not sure whether keyword searching across all the pages would cost
too many resources in the job manager, but I think it would be very useful
if the REST API could support it.

Best,
Yadong

On Sun, Sep 29, 2019 at 8:11 PM Xintong Song <[hidden email]> wrote:


Till Rohrmann
Regarding 3: at the moment, serving the log and stdout files requires the
TaskExecutor to be running. In some scenarios, e.g. when the files are on an
NFS, it should be enough to know where the file is located. However, this
assumption does not hold in the general case.

Cheers,
Till

On Mon, Sep 30, 2019 at 11:43 AM Yadong Xie <[hidden email]> wrote:


Xintong Song
@Yadong

2. I agree that we can update the task executor UI after FLIP-56 is done.
But I would suggest keeping the discussion open to come up with a proper UI
design for task executor resources. I don't think the mentioned image from
FLIP-56 is a good choice. That image is a simplified figure with CPU and
total memory only, for the purpose of demonstrating dynamic slot
allocation. In fact, there are 6 fields to be displayed (cpu, task heap,
task off-heap, shuffle, on-heap managed, off-heap managed). If we display
CPU and total memory only, users will be confused when they see a task
executor that has enough remaining resources but onto which tasks cannot be
deployed (because the desired type of memory might be used up).
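
The confusing case described in point 2 can be sketched in a few lines of
Python. This is an assumption-laden illustration, not Flink code: the six
field names mirror the list above, while the units (cores, MB), the sample
numbers, and the helpers fits_by_summary and fits_by_field are hypothetical.

```python
from typing import Dict

# The six fields named in the message; units (cores, MB) are assumed.
FIELDS = ("cpu", "task_heap", "task_off_heap", "shuffle",
          "managed_on_heap", "managed_off_heap")
MEMORY_FIELDS = FIELDS[1:]


def total_memory(resources: Dict[str, float]) -> float:
    """Sum all memory fields, as a CPU-plus-total-memory view would."""
    return sum(resources[f] for f in MEMORY_FIELDS)


def fits_by_summary(request: Dict[str, float], available: Dict[str, float]) -> bool:
    """Compare CPU and total memory only."""
    return (request["cpu"] <= available["cpu"]
            and total_memory(request) <= total_memory(available))


def fits_by_field(request: Dict[str, float], available: Dict[str, float]) -> bool:
    """Compare every field individually."""
    return all(request[f] <= available[f] for f in FIELDS)


# Plenty of memory in total, but no on-heap managed memory left.
available = {"cpu": 2.0, "task_heap": 1024, "task_off_heap": 256,
             "shuffle": 128, "managed_on_heap": 0, "managed_off_heap": 512}
request = {"cpu": 1.0, "task_heap": 256, "task_off_heap": 0,
           "shuffle": 64, "managed_on_heap": 128, "managed_off_heap": 0}

summary_says_ok = fits_by_summary(request, available)
actually_ok = fits_by_field(request, available)
```

Here the summary check passes while the per-field check fails, which is
exactly the situation a CPU-plus-total-memory display would hide.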

4. I've been using the Blink Web UI, which already has log pagination. It's
quite common that we need to search for some keywords (e.g., exception,
error, warning) in a large amount of logs when diagnosing problems. I find
it very inconvenient that I have to click into each page to search for the
keywords, and I'd rather take the effort to find the original log files
on the filesystem to view the log. Personally speaking, if keyword
searching cannot be supported, I would prefer to take some time loading the
non-paginated logs rather than the paginated ones. Or we may at least have a
button in the Web UI for switching between the two alternatives.
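
A minimal sketch of what keyword searching across all pages could look like,
written in Python purely for illustration. The function search_pages and the
sample log lines are hypothetical, and the sketch assumes that pages are
split on line boundaries, so that no log line straddles two pages.

```python
from typing import Iterable, List, Tuple


def search_pages(pages: Iterable[str],
                 keywords: Iterable[str]) -> List[Tuple[int, int, str]]:
    """Return (page number, line number within the page, line) per match.

    Matching is case-insensitive; page and line numbers are 1-based.
    """
    needles = [k.lower() for k in keywords]
    hits = []
    for page_no, page in enumerate(pages, start=1):
        for line_no, line in enumerate(page.splitlines(), start=1):
            lowered = line.lower()
            if any(needle in lowered for needle in needles):
                hits.append((page_no, line_no, line))
    return hits


# Two hypothetical log pages.
pages = [
    "INFO Job started\nWARNING Slow checkpoint",
    "INFO Checkpoint completed\nERROR Task executor lost",
]
hits = search_pages(pages, ["error", "warning"])
```

Returning the page number with each hit would let a UI jump straight to the
matching page instead of requiring a click through every page.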

@Till

Thanks for the input.

Thank you~

Xintong Song



On Mon, Sep 30, 2019 at 5:55 PM Till Rohrmann <[hidden email]> wrote:


Yadong Xie
Hi Xintong Song,

2. We could switch between a detailed mode (including cpu, task heap,
task off-heap, shuffle, on-heap managed, off-heap managed) and a summary
mode (including only cpu and mem), which is very easy to do in the UI design.
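
As a rough illustration of the two modes, the summary view can be derived
from the detailed one by summing the memory fields. The Python sketch below
is hypothetical: the field names, the MB unit, and the helper to_summary are
assumptions made for the example, not part of the FLIP.

```python
from typing import Dict

# Memory fields of the detailed mode; values are assumed to be in MB.
MEMORY_FIELDS = ("task_heap", "task_off_heap", "shuffle",
                 "managed_on_heap", "managed_off_heap")


def to_summary(detailed: Dict[str, float]) -> Dict[str, float]:
    """Collapse the six detailed fields into cpu plus total memory."""
    return {
        "cpu": detailed["cpu"],
        "mem": sum(detailed[field] for field in MEMORY_FIELDS),
    }


detailed = {"cpu": 2.0, "task_heap": 1024, "task_off_heap": 256,
            "shuffle": 128, "managed_on_heap": 0, "managed_off_heap": 512}
summary = to_summary(detailed)
```

Since the summary is computed from the detailed data, a UI toggle would only
need the detailed values from the backend.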

4. I think the key point is not pagination in the Web UI, but that the REST
API will totally *break* without pagination in the current design.
In my opinion, pagination is better than nothing: it is a solution to keep
the log API working, and it would be great if there were another way to keep
it working with huge log data.
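
To illustrate why pagination keeps a log endpoint workable with huge files,
here is a minimal byte-range sketch in Python. It is not the Flink REST API:
the function read_page, its parameters, and the in-memory sample log are
hypothetical. The point is only that each request reads a bounded number of
bytes regardless of the total log size.

```python
import io
from typing import BinaryIO, Tuple


def read_page(log: BinaryIO, page: int, page_size: int) -> Tuple[bytes, bool]:
    """Return the bytes of the 0-based page and whether more data follows."""
    if page < 0 or page_size <= 0:
        raise ValueError("page must be >= 0 and page_size must be > 0")
    log.seek(page * page_size)
    chunk = log.read(page_size)
    has_more = bool(log.read(1))  # peek one byte past the end of the page
    return chunk, has_more


# A tiny in-memory stand-in for a log file (21 bytes).
log = io.BytesIO(b"line 1\nline 2\nline 3\n")
first, more_after_first = read_page(log, page=0, page_size=14)
last, more_after_last = read_page(log, page=1, page_size=14)
```

A fixed byte size can cut a line in half at a page boundary, so a real
implementation would probably need to align pages to line breaks.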

On Mon, Sep 30, 2019 at 7:19 PM Xintong Song <[hidden email]> wrote:

> @Yadong
>
> 2. I agree that we can update the task executor ui after flip-56 is done.
> But I would suggest keep it on discussion to come up with a proper ui
> design for task executor resources. I don't think the mentioned image from
> flip-56 is a good choice. That image is a simplified figure with cpu and
> total memory only, for the purpose of demonstrating dynamic slot
> allocation. In fact, there are 6 fields to be displayed (cpu, task heap,
> task off-heap, shuffle, on-heap managed, off-heap managed). If we display
> cpu and total memory only, then user will be confused when seeing a task
> executor with enough remaining resources but tasks cannot be deployed onto
> it (because the desired type of memory might be used up).
>
> 4. I've been using blink webui, which already have log pagination. It's
> quite common that we need do search for some keywords (e.g., exception,
> error, warning) from a large amount of logs for diagnosing problems. I find
> it very inconvenient that I have to click into each page searching for the
> keywords, and I'd rather take the effort to find the original log files
> from the filesystem to view the log. Personally speaking, if the keyword
> searching cannot be supported, I would prefer to take some time loading the
> non-paginated logs over than paginated ones. Or we may at least have a
> button on the webui for switching between the two alternatives.
>
> @Till
>
> Thanks for the inputs.
>
> Thank you~
>
> Xintong Song

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Xintong Song
2. Sounds good to me.

4. If that is the case, I would suggest using a large default page size,
so that in case of huge log data we have fewer, larger pages rather than
lots of small pages.
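
As a rough illustration of this trade-off, the following Python sketch compares page counts for two page sizes and shows which pages a keyword search would have to visit. The sizes (64 KiB, 16 MiB, a 1 GiB log), the line-based paging, and the helper names `page_count` and `pages_matching` are assumptions made for the example; they are not taken from Flink or from the FLIP.

```python
import math


def page_count(log_size, page_size):
    """Number of pages needed to cover log_size bytes."""
    return math.ceil(log_size / page_size)


def pages_matching(lines, keyword, lines_per_page):
    """Zero-based page numbers containing at least one line with keyword."""
    hits = set()
    for index, line in enumerate(lines):
        if keyword in line:
            hits.add(index // lines_per_page)
    return sorted(hits)


one_gib = 1024 ** 3
small_pages = page_count(one_gib, 64 * 1024)         # assumed 64 KiB pages
large_pages = page_count(one_gib, 16 * 1024 * 1024)  # assumed 16 MiB pages

# Tiny hypothetical log, paged two lines at a time.
sample = ["INFO start", "ERROR boom", "INFO ok", "INFO ok", "ERROR again"]
error_pages = pages_matching(sample, "ERROR", 2)
```

With these assumed numbers the same log needs 16384 small pages but only 64 large ones, so a user hunting for a keyword by hand has far fewer pages to click through when the default page size is large.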

Thank you~

Xintong Song



On Tue, Oct 8, 2019 at 11:03 AM Yadong Xie <[hidden email]> wrote:

> Hi Xintong Song
>
> 2. We could switch between the detailed mode (including CPU, task heap,
> task off-heap, shuffle, on-heap managed, off-heap managed) and the summary
> mode (only including CPU and memory), which is very easy to do in UI design.
>
> 4. I think the key point is not pagination in the Web UI; rather, the REST
> API will totally *break* without pagination under the current design.
> In my opinion, pagination is better than nothing. Pagination is a
> solution that keeps the log API working, and it would be great if there
> were another way to keep it working with huge log data.

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

jing
In reply to this post by Yadong Xie
Hi all, I have updated the backend design in FLIP-75
<https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing>.

Here is a brief summary:

   - Add a metric for managed memory: FLINK-14406
   <https://issues.apache.org/jira/browse/FLINK-14406>.
   - Expose TaskExecutor resource configurations through the REST API:
   FLINK-14422 <https://issues.apache.org/jira/browse/FLINK-14422>.
   - Add TaskManagerResourceInfo to TaskManagerDetailsInfo to show
   TaskManager resources: FLINK-14435
   <https://issues.apache.org/jira/browse/FLINK-14435>.

I will continue to update the rest of the backend design in the doc.
Let's keep the discussion here; any feedback is appreciated.
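
For readers following the resource discussion earlier in this thread, here is a small Python sketch of how the six resource fields could back both a detailed and a summary view. It is only an illustration: `TaskExecutorResources`, its field names, and the sample values are hypothetical and do not reflect the actual TaskManagerResourceInfo class or the REST response format. Treating shuffle as a memory field is also an assumption.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class TaskExecutorResources:
    """Hypothetical container for the six fields named in this thread."""
    cpu_cores: float
    task_heap_mb: int
    task_off_heap_mb: int
    shuffle_mb: int
    on_heap_managed_mb: int
    off_heap_managed_mb: int

    def detailed(self):
        """Detailed mode: every field, e.g. one bar per field in the UI."""
        return {f.name: getattr(self, f.name) for f in fields(self)}

    def summary(self):
        """Summary mode: CPU plus the sum of all memory fields."""
        total = sum(v for k, v in self.detailed().items() if k != "cpu_cores")
        return {"cpu_cores": self.cpu_cores, "total_memory_mb": total}


# Made-up values, purely for illustration.
example = TaskExecutorResources(2.0, 512, 128, 64, 256, 256)
detailed_view = example.detailed()
summary_view = example.summary()
```

Keeping the detailed fields as the single source and deriving the summary from them means the two UI modes cannot disagree, although the summary still hides which kind of memory is exhausted.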


Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Yadong Xie
Hi everyone

We have spent some time updating the documentation since the last
discussion.

In short, the latest FLIP-75 contains the following proposals (covering both
the frontend and the REST API):

   1. Job Level
      - better job backpressure detection
      - "load more" feature for job exceptions
      - show attempt history in the subtask
      - show attempt timeline
      - add pending slots
   2. Task Manager Level
      - add more metrics
      - better log display
   3. Job Manager Level
      - add metrics tab
      - better log display

To help everyone better understand the proposal, we put some effort into
building an online POC <http://101.132.122.69:8081/web/#/overview>.

Now you can compare the new and old web UI and REST API (the
link is inside the doc)!

Here is the latest FLIP-75 doc:
https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#

Looking forward to your feedback.


Best,
Yadong

On Thu, Oct 24, 2019 at 2:11 PM lining jing <[hidden email]> wrote:

> Hi all, I have updated the backend design in FLIP-75
> <
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
> >
> .
>
> Here are some brief introductions:
>
>    - Add metric for manage memory FLINK-14406
>    <https://issues.apache.org/jira/browse/FLINK-14406>.
>    - Expose TaskExecutor resource configurations to REST API FLINK-14422
>    <https://issues.apache.org/jira/browse/FLINK-14422>.
>    - Add TaskManagerResourceInfo in TaskManagerDetailsInfo to show
>    TaskManager Resource FLINK-14435
>    <https://issues.apache.org/jira/browse/FLINK-14435>.
>
> I will continue to update the rest part of the backend design in the doc,
> let's keep discuss here, any feedback is appreciated.

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Till Rohrmann
Thanks for the update Yadong. Big +1 for the proposed improvements for
Flink's web UI. I think they will be super helpful for our users.

Cheers,
Till

On Tue, Jan 7, 2020 at 10:00 AM Yadong Xie <[hidden email]> wrote:

> Hi everyone
>
> We have spent some time updating the documentation since the last
> discussion.
>
> In short, the latest FLIP-75 contains the following proposal(including both
> frontend and RestAPI)
>
>    1. Job Level
>       - better job backpressure detection
>       - load more feature in job exception
>       - show attempt history in the subtask
>       - show attempt timeline
>       - add pending slots
>    2. Task Manager Level
>       - add more metrics
>       - better log display
>    3. Job Manager Level
>       - add metrics tab
>       - better log display
>
> To help everyone better understand the proposal, we spent efforts on making
> an online POC <http://101.132.122.69:8081/web/#/overview>.
>
> Now you can compare the difference between the new and old Web/RestAPI (the
> link is inside the doc)!
>
> Here is the latest FLIP-75 doc:
>
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#
>
> Looking forward to your feedback
>
>
> Best,
> Yadong

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Robert Metzger
Thanks a lot for this work! I believe the web UI is very important, in
particular to new users. I'm very happy to see that you are putting effort
into improving the visibility into Flink through the proposed changes.

I cannot judge whether all the changes make total sense, but the discussion
has been open since September, and a good number of people have commented in
the document.
I wonder if we can move this FLIP to the voting stage?

On Wed, Jan 22, 2020 at 6:27 PM Till Rohrmann <[hidden email]> wrote:

> Thanks for the update Yadong. Big +1 for the proposed improvements for
> Flink's web UI. I think they will be super helpful for our users.
>
> Cheers,
> Till

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Till Rohrmann
Would it be easier if FLIP-75 were the umbrella FLIP and we voted
on the individual improvements as sub-FLIPs? Decreasing the scope should
make things easier.

Cheers,
Till

On Thu, Jan 30, 2020 at 2:35 PM Robert Metzger <[hidden email]> wrote:

> Thanks a lot for this work! I believe the web UI is very important, in
> particular to new users. I'm very happy to see that you are putting effort
> into improving the visibility into Flink through the proposed changes.
>
> I can not judge if all the changes make total sense, but the discussion has
> been open since September, and a good number of people have commented in
> the document.
> I wonder if we can move this FLIP to the VOTing stage?

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Yadong Xie
Hi Till,

I couldn't find how to create a sub-FLIP at cwiki.apache.org.
Do you mean creating 9 more FLIPs instead of FLIP-75?

On Thu, Jan 30, 2020 at 11:12 PM Till Rohrmann <[hidden email]> wrote:

> Would it be easier if FLIP-75 would be the umbrella FLIP and we would vote
> on the individual improvements as sub FLIPs? Decreasing the scope should
> make things easier.
>
> Cheers,
> Till

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Till Rohrmann
I think there is no such description because we never did it before. I just
figured that FLIP-75 could actually be a good candidate to start this
practice. We would need a community discussion first, though.

Cheers,
Till

On Mon, Feb 3, 2020 at 10:28 AM Yadong Xie <[hidden email]> wrote:

> Hi Till,
>
> I couldn't find how to create a sub-FLIP at cwiki.apache.org.
> Do you mean creating 9 more FLIPs instead of FLIP-75?

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Yadong Xie
Hi Till,

FLIP-75 has been open since September, and the design doc has gone through
3 versions and more than 20 patches.
I gave it a try, but it is hard to split the design doc into sub-FLIPs and keep
all the discussion history at the same time.

Maybe it would be better to start a separate discussion about voting on
individual sub-FLIPs, and have the next FLIP follow the new practice if possible?

On Mon, Feb 3, 2020 at 6:28 PM Till Rohrmann <[hidden email]> wrote:

> I think there is no such description because we never did it before. I just
> figured that FLIP-75 could actually be a good candidate to start this
> practice. We would need a community discussion first, though.
>
> Cheers,
> Till

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Till Rohrmann
Hi Yadong,

I think it would be fine to simply link to this discussion thread to keep
the discussion history. Maybe an easier way would be to create top-level
FLIPs for the individual changes proposed in FLIP-75. The reason I'm
proposing this is that it would be easier to vote on it and to implement it
because the scope is smaller. But maybe I'm wrong here and others could
chime in to voice their opinion.

Cheers,
Till

On Fri, Feb 7, 2020 at 9:58 AM Yadong Xie <[hidden email]> wrote:

> Hi Till,
>
> FLIP-75 has been open since September, and the design doc has gone through
> 3 versions and more than 20 patches.
> I gave it a try, but it is hard to split the design doc into sub-FLIPs and
> keep all the discussion history at the same time.
>
> Maybe it would be better to start a separate discussion about voting on
> individual sub-FLIPs, and have the next FLIP follow the new practice if
> possible?

Re: [DISCUSS] FLIP-75: Flink Web UI Improvement Proposal

Yadong Xie
Hi Till,

I see your point. I will soon create sub-FLIPs and start votes based on
FLIP-75 and the previous discussion.

On Sun, Feb 9, 2020 at 5:27 PM Till Rohrmann <[hidden email]> wrote:

> Hi Yadong,
>
> I think it would be fine to simply link to this discussion thread to keep
> the discussion history. Maybe an easier way would be to create top-level
> FLIPs for the individual changes proposed in FLIP-75. The reason I'm
> proposing this is that it would be easier to vote on it and to implement it
> because the scope is smaller. But maybe I'm wrong here and others could
> chime in to voice their opinion.
>
> Cheers,
> Till
>
> On Fri, Feb 7, 2020 at 9:58 AM Yadong Xie <[hidden email]> wrote:
>
> > Hi Till
> >
> > FLIP-75 has been open since September, and the design doc has been
> iterated
> > over 3 versions and more than 20 patches.
> > I had a try, but it is hard to split the design docs into sub FLIP and
> keep
> > all the discussion history at the same time.
> >
> > Maybe it is better to start another discussion to talk about the
> individual
> > sub FLIP voting? and make the next FLIP follow the new practice if
> > possible.
> >
> > Till Rohrmann <[hidden email]> 于2020年2月3日周一 下午6:28写道:
> >
> > > I think there is no such description because we never did it before. I
> > just
> > > figured that FLIP-75 could actually be a good candidate to start this
> > > practice. We would need a community discussion first, though.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Feb 3, 2020 at 10:28 AM Yadong Xie <[hidden email]>
> wrote:
> > >
> > > > Hi Till
> > > > I didn’t find how to create of sub flip at cwiki.apache.org
> > > > do you mean to create 9 more FLIPS instead of FLIP-75?
> > > >
> > > > Till Rohrmann <[hidden email]> 于2020年1月30日周四 下午11:12写道:
> > > >
> > > > > Would it be easier if FLIP-75 would be the umbrella FLIP and we
> would
> > > > vote
> > > > > on the individual improvements as sub FLIPs? Decreasing the scope
> > > should
> > > > > make things easier.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Thu, Jan 30, 2020 at 2:35 PM Robert Metzger <
> [hidden email]>
> > > > > wrote:
> > > > >
> > > > > > Thanks a lot for this work! I believe the web UI is very
> important,
> > > in
> > > > > > particular to new users. I'm very happy to see that you are
> putting
> > > > > effort
> > > > > > into improving the visibility into Flink through the proposed
> > > changes.
> > > > > >
> > > > > > I can not judge if all the changes make total sense, but the
> > > discussion
> > > > > has
> > > > > > been open since September, and a good number of people have
> > commented
> > > > in
> > > > > > the document.
> > > > > > I wonder if we can move this FLIP to the VOTing stage?
> > > > > >
> > > > > > On Wed, Jan 22, 2020 at 6:27 PM Till Rohrmann <
> > [hidden email]>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the update Yadong. Big +1 for the proposed
> > improvements
> > > > for
> > > > > > > Flink's web UI. I think they will be super helpful for our
> users.
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > >
> > > > > > > On Tue, Jan 7, 2020 at 10:00 AM Yadong Xie <
> [hidden email]>
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi everyone
> > > > > > > >
> > > > > > > > We have spent some time updating the documentation since the
> > last
> > > > > > > > discussion.
> > > > > > > >
> > > > > > > > In short, the latest FLIP-75 contains the following
> > > > > proposal(including
> > > > > > > both
> > > > > > > > frontend and RestAPI)
> > > > > > > >
> > > > > > > >    1. Job Level
> > > > > > > >       - better job backpressure detection
> > > > > > > >       - load more feature in job exception
> > > > > > > >       - show attempt history in the subtask
> > > > > > > >       - show attempt timeline
> > > > > > > >       - add pending slots
> > > > > > > >    2. Task Manager Level
> > > > > > > >       - add more metrics
> > > > > > > >       - better log display
> > > > > > > >    3. Job Manager Level
> > > > > > > >       - add metrics tab
> > > > > > > >       - better log display
> > > > > > > >
> > > > > > > > To help everyone better understand the proposal, we spent
> > efforts
> > > > on
> > > > > > > making
> > > > > > > > an online POC <http://101.132.122.69:8081/web/#/overview>.
> > > > > > > >
> > > > > > > > Now you can compare the difference between the new and old
> > > > > Web/RestAPI
> > > > > > > (the
> > > > > > > > link is inside the doc)!
> > > > > > > >
> > > > > > > > Here is the latest FLIP-75 doc:
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#
> > > > > > > >
> > > > > > > > Looking forward to your feedback
> > > > > > > >
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yadong
> > > > > > > >
> > > > > > > > lining jing <[hidden email]> 于2019年10月24日周四 下午2:11写道:
> > > > > > > >
> > > > > > > > > > Hi all, I have updated the backend design in FLIP-75
> > > > > > > > > > <https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing>.
> > > > > > > > >
> > > > > > > > > > Here is a brief summary:
> > > > > > > > > >
> > > > > > > > > >    - Add a metric for managed memory: FLINK-14406
> > > > > > > > > >    <https://issues.apache.org/jira/browse/FLINK-14406>.
> > > > > > > > > >    - Expose TaskExecutor resource configurations to the REST API:
> > > > > > > > > >    FLINK-14422 <https://issues.apache.org/jira/browse/FLINK-14422>.
> > > > > > > > > >    - Add TaskManagerResourceInfo to TaskManagerDetailsInfo to show
> > > > > > > > > >    TaskManager resources: FLINK-14435
> > > > > > > > > >    <https://issues.apache.org/jira/browse/FLINK-14435>.
> > > > > > > > >
> > > > > > > > > > I will continue to update the rest of the backend design in the doc.
> > > > > > > > > > Let's keep the discussion here; any feedback is appreciated.
> > > > > > > > >
> > > > > > > > > > On Fri, Sep 27, 2019 at 10:13 AM, Yadong Xie <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi all
> > > > > > > > > >
> > > > > > > > > > > Flink Web UI is the main platform for most users to monitor their jobs and
> > > > > > > > > > > clusters. We have reconstructed Flink web in 1.9.0 version, but there are
> > > > > > > > > > > still some shortcomings.
> > > > > > > > > >
> > > > > > > > > > > This discussion thread aims to provide a better experience for Flink UI
> > > > > > > > > > > users.
> > > > > > > > > >
> > > > > > > > > > Here is the design doc I drafted:
> > > > > > > > > >
> > > > > > > > > > > https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > The FLIP can be found at [2].
> > > > > > > > > >
> > > > > > > > > > Please keep the discussion here, in the mailing list.
> > > > > > > > > >
> > > > > > > > > > > Looking forward to your opinions, any feedbacks are welcome.
> > > > > > > > > >
> > > > > > > > > > [1]:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit?usp=sharing
> > > > > > > > > > <
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/1tIa8yN2prWWKJI_fa1u0t6h1r6RJpp56m48pXEyh6iI/edit#
> > > > > > > > > > >
> > > > > > > > > > [2]:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-75%3A+Flink+Web+UI+Improvement+Proposal