Currently, processes started in the foreground (like in the case of Docker) output all logging/stdout directly to the console, without creating any logging files.

The downside of this approach, as outlined in FLIP-111, is that the WebUI is not able to display the logs since it relies on these very files to exist.

In FLINK-17166 (part of FLIP-111) we are trying to change this such that we always create .log/.out files. It seems like a reasonable change to make, but it could have repercussions on existing deployments since we will naturally use more disk space (logs gotta go somewhere).

I'm curious what people think about this. |
Thanks, Chesnay, for starting this discussion.

In the FLINK-17166 implementation [1], we are trying to use "tee" instead of introducing stream redirection (redirecting out/err to files). However, a side effect is that the logging will be duplicated in both the .log and .out files, so it may consume more disk space. That is not a very critical problem, though, since we could use the log4j/logback configuration to control the rolling files and max size.

Also, it only happens in Docker/K8s deployments. For YARN/Mesos deployments, the behavior is the same as before.

[1] https://github.com/apache/flink/pull/11839

Best,
Yang

(In reply to Chesnay Schepler, Wed, Apr 29, 2020 at 12:30 AM) |
Hi everyone,

thanks for starting this discussion, Chesnay.

I think it would be nice if we also displayed the logs when starting the process in the foreground.

The repercussions could be mitigated if the default logger configurations contained file rolling with a max log file size.

@Yang I think there are ways to redirect stdout and stderr into separate files using tee without duplication [1].

[1] http://www.softpanorama.org/Tools/tee.shtml

Cheers,
Till

(In reply to Yang Wang, Wed, Apr 29, 2020 at 4:28 AM) |
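To make the "tee without duplication" idea concrete, here is a minimal bash sketch, not the approach taken in the FLINK-17166 pull request. It assumes a hypothetical helper named `run_split_logs` and caller-chosen file paths. Stdout goes to one file, stderr to another, and both still reach the console on their original streams, so nothing is written twice.

```bash
#!/usr/bin/env bash
# Hypothetical helper (name and arguments are illustrative, not part of Flink):
#   run_split_logs OUT_FILE ERR_FILE COMMAND [ARGS...]
# The function body is a subshell so that pipefail does not leak to the caller.
run_split_logs() (
  set -o pipefail            # propagate the command's exit status through the pipes
  out_file=$1
  err_file=$2
  shift 2
  # fd 3 keeps a handle on the original stdout.
  # Inner pipe: command stdout -> tee -> OUT_FILE and original stdout (via fd 3).
  # Outer pipe: command stderr -> tee -> ERR_FILE and original stderr.
  { { "$@" 3>&- | tee -a "$out_file" >&3; } 2>&1 | tee -a "$err_file" >&2 3>&-; } 3>&1
)
```

A call such as `run_split_logs app.out app.err some-command` (placeholder names) would leave `kubectl logs` style console output intact while filling the two files. Whether this fits the actual Flink start scripts is an open question for the thread.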
I like this idea because it should improve the experience (and reduce confusion) for folks having their first Flink experience via one of the Docker playgrounds. Right now it gives the impression that something is broken out-of-the-box.

Regards,
David

(In reply to Till Rohrmann, Mon, May 4, 2020 at 6:01 PM) |
I think Patrick originally introduced the foreground mode, and I believe it indeed had something to do with container use and logging.

IIRC the default assumption in Docker and Kubernetes is that the logs come on stdout (or stderr), so following the "principle of least astonishment" the idea was to give a similar experience with Flink.

(In reply to David Anderson, Tue, May 5, 2020 at 10:49 AM) |
@Patrick could you chime in?

We should at least understand the original motivation before simply changing the way it works.

(In reply to Stephan Ewen, Tue, May 5, 2020 at 2:49 PM) |
From my previous experience with K8s, I'd assume that the cluster itself already has some ELK attached to it and all stdout/err is collected automatically.

So if you want to also add log files, I'd make that configurable, and I'm torn on what the default should be, as both make sense.

However, thinking a bit further, it sounds as if the current way is already very inconsistent. We should probably either have log files for everything or log everything to stdout/err and let K8s deal with it. And then we should have some configuration to toggle between the modes.

--
Arvid Heise | Senior Java Developer
<https://www.ververica.com/>

(In reply to Stephan Ewen, Tue, May 5, 2020 at 2:50 PM) |
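A toggle between the modes could look roughly like the bash sketch below. This is purely illustrative: the function name `run_with_log_mode` and the mode names `console`, `file` and `both` are made up for this example and are not existing Flink options. For brevity the `both` branch merges stderr into stdout before tee-ing.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run_with_log_mode MODE LOG_FILE COMMAND [ARGS...]
# MODE and its values are invented for illustration, not Flink settings.
run_with_log_mode() {
  local mode=$1 log_file=$2
  shift 2
  case "$mode" in
    console)
      # Everything goes to stdout/stderr; the container runtime collects it.
      "$@"
      ;;
    file)
      # Everything goes to the log file; the console stays quiet.
      "$@" >>"$log_file" 2>&1
      ;;
    both)
      # Write to the log file and keep printing to the console.
      "$@" 2>&1 | tee -a "$log_file"
      ;;
    *)
      echo "unknown log mode: $mode" >&2
      return 2
      ;;
  esac
}
```

Which mode should be the default is exactly the question this thread leaves open.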
The downside of changing the default behavior is the breakage of existing K8s or other container-based production setups.

Logs that are emitted to stdout are visible through kubectl logs, and infrastructure is set up for log collection.

Thanks,
Thomas

(In reply to Arvid Heise, Tue, May 5, 2020 at 6:31 AM) |
Hi Thomas Weise,

I am not sure why this change would break existing K8s/container-based setups. Since it will output the logs to stdout and to log files at the same time, you could still use `kubectl logs` to view the logs, and log collection would work just as before.

The benefit is that the logs could also be accessed via the Flink web dashboard. I think that is more convenient when users do not have permission to execute `kubectl`.

@Till Rohrmann, it would be great if we could use tee to side-output logs to a file and avoid duplication. I will have a look.

Best,
Yang

(In reply to Thomas Weise, Wed, May 6, 2020 at 1:40 AM) |
Just for clarification, and as Yang already pointed out: the discussion here is about also creating the log, out and err files while continuing to write to STDOUT and STDERR. Hence, there should be no regression for K8s users.

The main problem, as Chesnay pointed out, could be the increased disk usage from creating these files.

Cheers,
Till

(In reply to Yang Wang, Wed, May 6, 2020 at 5:10 AM) |
Thanks for clarifying, that was not clear to me.
That sounds fine to me, given that it only adds extra information and does not change the existing output. |
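The behaviour discussed in this thread (keep writing to the console while also creating log files, with rolling to cap disk usage) can be illustrated with a small sketch. This is only an analogy using Python's standard `logging` module, not Flink's log4j/logback configuration; the logger name, file name, and size limits are hypothetical values chosen for the example.

```python
import logging
import sys
from logging.handlers import RotatingFileHandler


def build_logger(log_path, max_bytes=1024, backups=2):
    """Return a logger that sends each record to stdout AND to a rolling file.

    Assumption: max_bytes and backups are illustrative limits, not Flink defaults.
    """
    logger = logging.getLogger("foreground-demo")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records away from the root logger

    # Console output stays as before, so tools reading stdout keep working.
    console = logging.StreamHandler(sys.stdout)
    # The file handler rolls over near max_bytes and keeps at most `backups`
    # old files, which bounds the extra disk usage.
    rolling = RotatingFileHandler(log_path, maxBytes=max_bytes, backupCount=backups)

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (console, rolling):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger("demo.log")
    for i in range(200):
        log.info("message %d", i)
```

With these settings, disk usage stays below roughly `(backups + 1) * max_bytes`, while every record still appears on the console.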