(DEPRECATED) Apache Flink Mailing List archive.

HA Cluster restart behaviour

Gyula Fóra

HA Cluster restart behaviour

Hi,

I have noticed some strange behaviour on a streaming cluster running in HA
mode.

I have stopped a cluster with some deployed jobs (stop-cluster.sh without
cancelling the jobs) and when I bring the cluster back up the jobs that
were running before are restarted.

Is this the expected behaviour? It feels strange that jobs will be
automatically redeployed after specifically calling stop-cluster.

Regards,
Gyula

Ufuk Celebi-2

Re: HA Cluster restart behaviour

Yes, it's expected, but you are certainly not the first one to be
confused by this behaviour.

The reasoning behind the current behaviour is that we don't want users
accidentally removing jobs, which seems worse than requiring users to
cancel manually. We thought about adding a flag to the start scripts
to either clear the jobs on start up or shut down. What's your opinion
on this?

– Ufuk
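
[Editorial sketch, not part of the original message.] The behaviour discussed here can be pictured with a toy model in Python. It assumes that submitted jobs are recorded in some storage that outlives the cluster processes, and that only an explicit cancel removes them. `HaStore`, `Cluster` and the `clear_jobs` flag are hypothetical names for illustration; they are not Flink APIs or real script options.

```python
class HaStore:
    """Stand-in for HA storage that outlives the cluster processes (assumption)."""

    def __init__(self):
        self.jobs = set()


class Cluster:
    def __init__(self, store):
        self.store = store
        self.running = set()

    def start(self, clear_jobs=False):
        # Hypothetical flag: drop persisted jobs instead of recovering them.
        if clear_jobs:
            self.store.jobs.clear()
        # Recovery: every job still registered in the store is restarted.
        self.running = set(self.store.jobs)

    def submit(self, job):
        self.store.jobs.add(job)
        self.running.add(job)

    def cancel(self, job):
        # Explicit cancellation removes the job from the store as well.
        self.store.jobs.discard(job)
        self.running.discard(job)

    def stop(self):
        # The processes go away, but the store is left untouched.
        self.running = set()
```

In this model, a job that was submitted and never cancelled reappears after `stop()` followed by `start()` on a new `Cluster` sharing the same store, while `start(clear_jobs=True)` brings the cluster up empty.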

On Wed, Jun 1, 2016 at 1:42 PM, Gyula Fóra <[hidden email]> wrote:

> Hi,
>
> I have noticed some strange behaviour on a streaming cluster running in HA
> mode.
>
> I have stopped a cluster with some deployed jobs (stop-cluster.sh without
> cancelling the jobs) and when I bring the cluster back up the jobs that
> were running before are restarted.
>
> Is this the expected behaviour? It feels strange that jobs will be
> automatically redeployed after specifically calling stop-cluster.
>
> Regards,
> Gyula

Márton Balassi

Re: HA Cluster restart behaviour

I also think that the current mechanism is weird. IMHO it makes sense to
add the flag to both the start and stop scripts.

On Wed, Jun 1, 2016 at 2:09 PM, Ufuk Celebi <[hidden email]> wrote:

> Yes, it's expected, but you are certainly not the first one to be
> confused by this behaviour.
>
> The reasoning behind the current behaviour is that we don't want users
> accidentally removing jobs, which seems worse than requiring users to
> cancel manually. We thought about adding a flag to the start scripts
> to either clear the jobs on start up or shut down. What's your opinion
> on this?
>
> – Ufuk

Gyula Fóra

Re: HA Cluster restart behaviour

So you mean that you don't want people to accidentally remove all jobs by
shutting down the cluster? I think it is a bigger risk that people will
actually click the cancel button on the website by accident :D

For me it would seem intuitive that when I stop the cluster it stops the
jobs. Definitely a flag would help to make sure the jobs are cleared but I
am not sure what the default behaviour should be.

Gyula

On Wed, Jun 1, 2016 at 2:14 PM, Márton Balassi <[hidden email]> wrote:

> I also think that the current mechanism is weird. IMHO it makes sense to
> add the flag to both the start and stop scripts.

mxm

Re: HA Cluster restart behaviour

At the moment, the stop-cluster script simply sends a TERM signal to
all processes using "kill". Shutting down the cluster cleanly is a bit
more complicated and would block for a longer time. I think the
current approach is the safest for most users. I agree that it would
be nice to have an option to shut down cleanly.
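
[Editorial sketch, not part of the original message.] For readers unfamiliar with what sending a TERM signal implies, the following Python snippet shows the mechanism on a POSIX system. It starts a stand-in long-running process and sends it SIGTERM, which is the default signal of `kill <pid>`. The helper names `spawn_sleeper` and `term_and_wait` are made up for this illustration and have no relation to the actual stop-cluster script.

```python
import signal
import subprocess
import sys


def spawn_sleeper():
    # A stand-in for a long-running cluster process.
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


def term_and_wait(proc, timeout=10.0):
    """Send SIGTERM to the child and return its exit status."""
    proc.send_signal(signal.SIGTERM)
    return proc.wait(timeout=timeout)
```

On POSIX, `subprocess` reports a child killed by a signal with a negative return code, so a process that does not handle SIGTERM exits with `-signal.SIGTERM`. Nothing in this path gives the process a chance to cancel work or clean up external state unless it installs its own handler.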

On Wed, Jun 1, 2016 at 2:21 PM, Gyula Fóra <[hidden email]> wrote:

> So you mean that you don't want people to accidentally remove all jobs by
> shutting down the cluster? I think it is a bigger risk that people will
> actually click the cancel button on the website by accident :D
>
> For me it would seem intuitive that when I stop the cluster it stops the
> jobs. Definitely a flag would help to make sure the jobs are cleared but I
> am not sure what the default behaviour should be.
>
> Gyula