[DISCUSS] Have foreground processes also create log files

[DISCUSS] Have foreground processes also create log files

Chesnay Schepler-3
Currently, processes started in the foreground (as is the case with
Docker) write all logging/stdout output directly to the console, without
creating any log files.

The downside of this approach, as outlined in FLIP-111, is that the
WebUI cannot display the logs, since it relies on these very files
existing.

In FLINK-17166 (part of FLIP-111) we are trying to change this so that
we always create .log/.out files. It seems like a reasonable change to
make, but it could have repercussions for existing deployments, since we
will naturally use more disk space (logs gotta go somewhere).

I'm curious what people think about this.


Re: [DISCUSS] Have foreground processes also create log files

Yang Wang
Thanks, Chesnay, for starting this discussion.

In the FLINK-17166 implementation [1], we are trying to use "tee" instead of
introducing stream redirection (redirecting stdout/stderr to files). However,
a side effect is that the log output is duplicated in both the .log and
.out files, so it may consume more disk space. This is not a critical
problem, though, since we can use the log4j/logback configuration to control
file rolling and the maximum file size.

Also, this only happens in Docker/K8s deployments. For YARN/Mesos
deployments, the behavior is the same as before.
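
To illustrate why the duplication happens, here is a minimal sketch in Python rather than the actual shell/log4j setup from the PR. It assumes a logger with one file handler (standing in for the file appender that writes .log) and one console handler, whose output is additionally copied to an .out file by a small `Tee` class (standing in for the shell's `tee`). The names `Tee`, `run_demo`, `demo.log` and `demo.out` are hypothetical and not part of Flink.

```python
import logging
import os
import sys
import tempfile


class Tee:
    """Minimal write-only stream that copies text to several targets, like `tee`."""

    def __init__(self, *targets):
        self.targets = targets

    def write(self, text):
        for target in self.targets:
            target.write(text)
        return len(text)

    def flush(self):
        for target in self.targets:
            target.flush()


def run_demo(log_dir):
    """Log one line and return the contents of the .log and .out files."""
    log_path = os.path.join(log_dir, "demo.log")  # hypothetical file names
    out_path = os.path.join(log_dir, "demo.out")

    with open(out_path, "w") as out_file:
        # Everything written to the "console" is also copied into the .out file.
        console = Tee(sys.stdout, out_file)

        logger = logging.getLogger("tee-demo")
        logger.setLevel(logging.INFO)
        logger.propagate = False
        file_handler = logging.FileHandler(log_path)      # plays the file appender (.log)
        console_handler = logging.StreamHandler(console)  # plays the console appender
        logger.addHandler(file_handler)
        logger.addHandler(console_handler)

        logger.info("job started")

        # Detach the handlers and close the .log file before reading it back.
        for handler in (file_handler, console_handler):
            logger.removeHandler(handler)
            handler.close()

    with open(log_path) as f:
        log_text = f.read()
    with open(out_path) as f:
        out_text = f.read()
    return log_text, out_text


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_text, out_text = run_demo(tmp)
    print("in .log:", "job started" in log_text)
    print("in .out:", "job started" in out_text)
```

The same record ends up in both files because it reaches the .log file through the file handler and the .out file through the copied console stream.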


[1]. https://github.com/apache/flink/pull/11839

Best,
Yang

On Wed, Apr 29, 2020 at 12:30 AM Chesnay Schepler <[hidden email]> wrote:

> Currently, processes started in the foreground (as is the case with
> Docker) write all logging/stdout output directly to the console, without
> creating any log files.
>
> The downside of this approach, as outlined in FLIP-111, is that the
> WebUI cannot display the logs, since it relies on these very files
> existing.
>
> In FLINK-17166 (part of FLIP-111) we are trying to change this so that
> we always create .log/.out files. It seems like a reasonable change to
> make, but it could have repercussions for existing deployments, since we
> will naturally use more disk space (logs gotta go somewhere).
>
> I'm curious what people think about this.

Re: [DISCUSS] Have foreground processes also create log files

Till Rohrmann
Hi everyone,

Thanks for starting this discussion, Chesnay.

I think it would be nice if we also displayed the logs when starting the
process in the foreground.

The repercussions could be mitigated if the default logger configurations
included file rolling with a maximum log file size.
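
As a rough sketch of how size-capped rolling bounds disk usage, the following uses Python's standard `logging.handlers.RotatingFileHandler` as an analogue. It is not Flink's log4j/logback configuration; the function name `write_rolling_logs`, the file name `demo.log` and the chosen limits are assumptions for illustration only.

```python
import logging
import logging.handlers
import os
import tempfile


def write_rolling_logs(log_dir, max_bytes=1024, backup_count=2, lines=500):
    """Write many records through a size-capped rolling handler.

    Returns the sorted names of the files left in log_dir afterwards.
    """
    log_path = os.path.join(log_dir, "demo.log")  # hypothetical file name
    handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backup_count
    )
    logger = logging.getLogger("rolling-demo")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        for i in range(lines):
            logger.info("record %d", i)
    finally:
        # Detach and close so the files can be inspected or deleted.
        logger.removeHandler(handler)
        handler.close()
    return sorted(os.listdir(log_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in write_rolling_logs(tmp):
            size = os.path.getsize(os.path.join(tmp, name))
            print(name, size, "bytes")
```

However many records are written, at most `backup_count + 1` files of at most `max_bytes` each remain, so the worst-case disk usage is known in advance.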

@Yang I think there are ways to redirect stdout and stderr into
separate files using tee without duplication [1].

[1] http://www.softpanorama.org/Tools/tee.shtml

Cheers,
Till

On Wed, Apr 29, 2020 at 4:28 AM Yang Wang <[hidden email]> wrote:

> Thanks, Chesnay, for starting this discussion.
>
> In the FLINK-17166 implementation [1], we are trying to use "tee" instead of
> introducing stream redirection (redirecting stdout/stderr to files). However,
> a side effect is that the log output is duplicated in both the .log and
> .out files, so it may consume more disk space. This is not a critical
> problem, though, since we can use the log4j/logback configuration to control
> file rolling and the maximum file size.
>
> Also, this only happens in Docker/K8s deployments. For YARN/Mesos
> deployments, the behavior is the same as before.
>
>
> [1]. https://github.com/apache/flink/pull/11839
>
> Best,
> Yang

Re: [DISCUSS] Have foreground processes also create log files

David Anderson-3
I like this idea because it should improve the experience (and reduce
confusion) for folks having their first Flink experience via one of the
Docker playgrounds. Right now, the missing logs in the WebUI give the
impression that something is broken out of the box.

Regards,
David

On Mon, May 4, 2020 at 6:01 PM Till Rohrmann <[hidden email]> wrote:

> Hi everyone,
>
> Thanks for starting this discussion, Chesnay.
>
> I think it would be nice if we also displayed the logs when starting the
> process in the foreground.
>
> The repercussions could be mitigated if the default logger configurations
> included file rolling with a maximum log file size.
>
> @Yang I think there are ways to redirect stdout and stderr into
> separate files using tee without duplication [1].
>
> [1] http://www.softpanorama.org/Tools/tee.shtml
>
> Cheers,
> Till

Re: [DISCUSS] Have foreground processes also create log files

Stephan Ewen
I think Patrick originally introduced the foreground mode, and I believe it
did indeed have something to do with container use and logging.

IIRC the default assumption in Docker and Kubernetes is that logs are written
to stdout (or stderr), so following the "principle of least astonishment" the
idea was to give a similar experience with Flink.

On Tue, May 5, 2020 at 10:49 AM David Anderson <[hidden email]> wrote:

> I like this idea because it should improve the experience (and reduce
> confusion) for folks having their first Flink experience via one of the
> Docker playgrounds. Right now, the missing logs in the WebUI give the
> impression that something is broken out of the box.
>
> Regards,
> David

Re: [DISCUSS] Have foreground processes also create log files

Stephan Ewen
@Patrick could you chime in?

We should at least understand the original motivation before simply
changing the way it works.

On Tue, May 5, 2020 at 2:49 PM Stephan Ewen <[hidden email]> wrote:

> I think Patrick originally introduced the foreground mode, and I believe it
> did indeed have something to do with container use and logging.
>
> IIRC the default assumption in Docker and Kubernetes is that logs are written
> to stdout (or stderr), so following the "principle of least astonishment" the
> idea was to give a similar experience with Flink.

Re: [DISCUSS] Have foreground processes also create log files

Arvid Heise-3
From my previous experience with K8s, I'd assume that the cluster itself
already has some ELK stack attached to it and that all stdout/stderr output
is collected automatically.

So if you also want to write log files, I'd make that configurable, and I'm
torn on what the default should be, as both make sense.

However, thinking a bit further, it sounds as if the current behavior is
already very inconsistent. We should probably either have log files for
everything or log everything to stdout/stderr and let K8s deal with it. And
then we should have some configuration to toggle between the modes.
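
A minimal sketch of such a toggle, written with Python's standard `logging` module purely as an illustration. The mode names `console`, `file` and `both`, and the function `build_handlers`, are hypothetical; no such Flink option is implied by this thread.

```python
import logging
import sys


def build_handlers(mode, log_path):
    """Return the logging handlers for a (hypothetical) output mode setting."""
    if mode not in ("console", "file", "both"):
        raise ValueError(f"unknown log mode: {mode!r}")
    handlers = []
    if mode in ("console", "both"):
        # stdout is what `kubectl logs` and cluster-level collectors read
        handlers.append(logging.StreamHandler(sys.stdout))
    if mode in ("file", "both"):
        # delay=True postpones opening the file until the first record is written
        handlers.append(logging.FileHandler(log_path, delay=True))
    return handlers


if __name__ == "__main__":
    for mode in ("console", "file", "both"):
        names = [type(h).__name__ for h in build_handlers(mode, "demo.log")]
        print(mode, names)
```

With a single setting like this, the default can be debated separately from the mechanism, which is the part everyone here seems to agree on.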

On Tue, May 5, 2020 at 2:50 PM Stephan Ewen <[hidden email]> wrote:

> @Patrick could you chime in?
>
> We should at least understand the original motivation before simply
> changing the way it works.



Re: [DISCUSS] Have foreground processes also create log files

Thomas Weise
The downside of changing the default behavior is the breakage of existing
K8s or other container-based production setups.

Logs that are emitted to stdout are visible through kubectl logs, and
infrastructure is set up for log collection.

Thanks,
Thomas


On Tue, May 5, 2020 at 6:31 AM Arvid Heise <[hidden email]> wrote:

> From my previous experience with K8s, I'd assume that the cluster itself
> already has some ELK stack attached to it and that all stdout/stderr output
> is collected automatically.
>
> So if you also want to write log files, I'd make that configurable, and I'm
> torn on what the default should be, as both make sense.
>
> However, thinking a bit further, it sounds as if the current behavior is
> already very inconsistent. We should probably either have log files for
> everything or log everything to stdout/stderr and let K8s deal with it. And
> then we should have some configuration to toggle between the modes.

Re: [DISCUSS] Have foreground processes also create log files

Yang Wang
Hi Thomas Weise,

I am not sure why this change would break existing K8s/container-based
setups. Since the logs will be written to stdout and to log files at the
same time, you can still use `kubectl logs` to view them, and log
collection should work just as before.

The benefit is that the logs can also be accessed via the Flink web
dashboard, which I think is more convenient when users do not have
permission to execute `kubectl`.


@Till Rohrmann <[hidden email]>, it would be great if we could use tee
to side-output the logs to files and avoid duplication. I will have a look.


Best,
Yang

On Wed, May 6, 2020 at 1:40 AM Thomas Weise <[hidden email]> wrote:

> The downside of changing the default behavior is the breakage of existing
> K8s or other container-based production setups.
>
> Logs that are emitted to stdout are visible through kubectl logs, and
> infrastructure is set up for log collection.
>
> Thanks,
> Thomas

Re: [DISCUSS] Have foreground processes also create log files

Till Rohrmann
Just to clarify, and as Yang already pointed out: the discussion
here is about creating the log, out and err files in addition to, not
instead of, writing to STDOUT and STDERR.

Hence, there should be no regression for K8s users. The main problem, as
Chesnay pointed out, could be the increased disk usage from creating these
files.

Cheers,
Till

On Wed, May 6, 2020 at 5:10 AM Yang Wang <[hidden email]> wrote:

> Hi Thomas Weise,
>
> I am not sure why this change would break existing K8s/container-based
> setups. Since the logs will be written to stdout and to log files at the
> same time, you can still use `kubectl logs` to view them, and log
> collection should work just as before.
>
> The benefit is that the logs can also be accessed via the Flink web
> dashboard, which I think is more convenient when users do not have
> permission to execute `kubectl`.
>
>
> @Till Rohrmann <[hidden email]>, it would be great if we could use tee
> to side-output the logs to files and avoid duplication. I will have a look.
>
>
> Best,
> Yang
> > > > >> >
> > > > >> > Cheers,
> > > > >> > Till
> > > > >> >
> > > > >> > On Wed, Apr 29, 2020 at 4:28 AM Yang Wang <
> [hidden email]>
> > > > >> wrote:
> > > > >> >
> > > > >> > > Thanks for Chesnay starting this discussion.
> > > > >> > >
> > > > >> > > In FLINK-17166 implementation[1], we are trying to use "tee"
> > > instead
> > > > >> of
> > > > >> > > introducing the stream redirection(redirect the out/err to
> > files).
> > > > >> > However,
> > > > >> > > a side effect is that the logging will be duplicated both in
> > .log
> > > > and
> > > > >> > .out
> > > > >> > > files.
> > > > >> > > Then it may consume more disk space. However it is not a very
> > > > critical
> > > > >> > > problem since we could use log4j/logback configuration to
> > control
> > > > the
> > > > >> > > rolling
> > > > >> > > files and max size.
> > > > >> > >
> > > > >> > > Also, it only happens in docker/K8s deployment. For YARN/Mesos
> > > > >> > deployment,
> > > > >> > > the behavior is just same as before.
> > > > >> > >
> > > > >> > >
> > > > >> > > [1]. https://github.com/apache/flink/pull/11839
> > > > >> > >
> > > > >> > > Best,
> > > > >> > > Yang
> > > > >> > >
> > > > >> > > Chesnay Schepler <[hidden email]> 于2020年4月29日周三
> 上午12:30写道:
> > > > >> > >
> > > > >> > > > Currently, processes started in the foreground (like in the
> > case
> > > > of
> > > > >> > > > Docker) output all logging/stdout directly to the console,
> > > without
> > > > >> > > > creating any logging files.
> > > > >> > > >
> > > > >> > > > The downside of this approach, as outlined in FLIP-111, is
> > that
> > > > the
> > > > >> > > > WebUI is not able to display the logs since it relies on
> these
> > > > very
> > > > >> > > > files to exist.
> > > > >> > > >
> > > > >> > > > In FLINK-17166 (part of FLIP-111) we are trying to change
> this
> > > > such
> > > > >> > that
> > > > >> > > > we always created .log/.out files. It seems like a
> reasonable
> > > > >> change to
> > > > >> > > > do, but it could have repercussions on existing deployments
> > > since
> > > > we
> > > > >> > > > will naturally use more disk space (logs gotta go
> somewhere).
> > > > >> > > >
> > > > >> > > > I'm curious what people think about this.
> > > > >> > > >
> > > > >> > > >
> > > > >> > >
> > > > >> >
> > > > >>
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Arvid Heise | Senior Java Developer
> > >
> > > <https://www.ververica.com/>
> > >
> > > Follow us @VervericaData
> > >
> > > --
> > >
> > > Join Flink Forward <https://flink-forward.org/> - The Apache Flink
> > > Conference
> > >
> > > Stream Processing | Event Driven | Real Time
> > >
> > > --
> > >
> > > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> > >
> > > --
> > > Ververica GmbH
> > > Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > > Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji
> > > (Toni) Cheng
> > >
> >
>

Re: [DISCUSS] Have foreground processes also create log files

Stephan Ewen
Thanks for clarifying; that was not clear to me.

That sounds fine to me, given that it only adds extra information and does
not change the existing behavior.
