HA Cluster restart behaviour

HA Cluster restart behaviour

Gyula Fóra
Hi,

I have noticed some strange behaviour on a streaming cluster running in HA
mode.

I have stopped a cluster with some deployed jobs (stop-cluster.sh without
cancelling the jobs) and when I bring the cluster back up the jobs that
were running before are restarted.

Is this the expected behaviour? It feels strange that jobs will be
automatically redeployed after specifically calling stop-cluster.

Regards,
Gyula

Re: HA Cluster restart behaviour

Ufuk Celebi-2
Yes, it's expected, but you are certainly not the first one to be
confused by this behaviour.

The reasoning behind the current behaviour is that we don't want users
accidentally removing jobs, which seems worse than requiring users to
cancel manually. We thought about adding a flag to the start scripts
to either clear the jobs on start up or shut down. What's your opinion
on this?

– Ufuk

On Wed, Jun 1, 2016 at 1:42 PM, Gyula Fóra <[hidden email]> wrote:

> Hi,
>
> I have noticed some strange behaviour on a streaming cluster running in HA
> mode.
>
> I have stopped a cluster with some deployed jobs (stop-cluster.sh without
> cancelling the jobs) and when I bring the cluster back up the jobs that
> were running before are restarted.
>
> Is this the expected behaviour? It feels strange that jobs will be
> automatically redeployed after specifically calling stop-cluster.
>
> Regards,
> Gyula

Re: HA Cluster restart behaviour

Márton Balassi
I also think that the current mechanism is weird. IMHO it makes sense to
add the flag to both the start and stop scripts.

On Wed, Jun 1, 2016 at 2:09 PM, Ufuk Celebi <[hidden email]> wrote:

> Yes, it's expected, but you are certainly not the first one to be
> confused by this behaviour.
>
> The reasoning behind the current behaviour is that we don't want users
> accidentally removing jobs, which seems worse than requiring users to
> cancel manually. We thought about adding a flag to the start scripts
> to either clear the jobs on start up or shut down. What's your opinion
> on this?
>
> – Ufuk

Re: HA Cluster restart behaviour

Gyula Fóra
So you mean that you don't want people accidentally removing all jobs by
shutting down the cluster? I think it is a bigger risk that people will
actually click the cancel button on the website by accident :D

For me it would seem intuitive that when I stop the cluster it stops the
jobs. Definitely a flag would help to make sure the jobs are cleared but I
am not sure what the default behaviour should be.

Gyula

On Wed, Jun 1, 2016 at 2:14 PM, Márton Balassi <[hidden email]> wrote:

> I also think that the current mechanism is weird. IMHO it makes sense to
> add the flag to both the start and stop scripts.

Re: HA Cluster restart behaviour

mxm
At the moment, the stop-cluster script simply sends a TERM signal to
all processes using "kill". Shutting down the cluster cleanly is a bit
more complicated and would block for a longer time. I think the
current approach is the safest for most users. I agree that it would
be nice to have an option to shut down cleanly.
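
To make the TERM point concrete, here is a minimal Python sketch (not the
actual stop-cluster script) of what "send TERM to all processes" amounts to.
The start_dummy_daemon helper and the sleeping child processes are
hypothetical stand-ins for the cluster's JobManager and TaskManager
processes, and the sketch assumes a POSIX system. The processes are asked to
terminate, but nothing cancels the jobs, so any job state that HA mode has
persisted is left in place for the next start.

```python
import signal
import subprocess
import sys


def start_dummy_daemon():
    """Hypothetical stand-in for a cluster process (e.g. a JobManager)."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


def stop_cluster(procs, timeout=10.0):
    """Send SIGTERM to every process and wait for each one to exit.

    This mirrors a plain `kill <pid>`: no job is cancelled and no
    persisted state is cleaned up. Returns the exit codes (POSIX only).
    """
    for proc in procs:
        proc.send_signal(signal.SIGTERM)
    return [proc.wait(timeout=timeout) for proc in procs]


if __name__ == "__main__":
    daemons = [start_dummy_daemon() for _ in range(2)]
    print(stop_cluster(daemons))
```

On POSIX, each exit code is the negative signal number, which shows the
processes were killed by the signal rather than shut down through any
job-aware path. A clean shutdown would first have to cancel the jobs and wait
for that to finish, which is why it would block for longer.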

On Wed, Jun 1, 2016 at 2:21 PM, Gyula Fóra <[hidden email]> wrote:

> So you mean that you don't want people accidentally removing all jobs by
> shutting down the cluster? I think it is a bigger risk that people will
> actually click the cancel button on the website by accident :D
>
> For me it would seem intuitive that when I stop the cluster it stops the
> jobs. Definitely a flag would help to make sure the jobs are cleared but I
> am not sure what the default behaviour should be.
>
> Gyula