[Possible Bug] Savepoints Akka Timeout

Sayat Satybaldiyev
Dear Flink Community!

I've initially posted my question on SO:
https://stackoverflow.com/questions/52422499/flink-savepoints-akka-pattern-asktimeoutexception-ask-timed-out-on-actorakka

However, after further investigation, I think it's an issue with the Flink
1.6.0 cluster rather than me doing something wrong.

In a nutshell, I'm trying to create a savepoint for a Flink job with a RocksDB
backend. However, the Flink CLI gives me this error: CompletionException:
akka.pattern.AskTimeoutException: Ask timed out on
[Actor[akka://flink/user/jobmanager_0#1140973613]] after [300000 ms].
Sender[null] sent message of type
"org.apache.flink.runtime.rpc.messages.LocalFencedMessage"

I've checked the JM and TM logs and found that the TM completes the savepoints
of the operators without an exception, within a couple of seconds.
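
To put the timeout in perspective against the couple of seconds reported for the TaskManager, the figures in the error message can be pulled out and converted. This is only a minimal sketch: the error text is copied from the post with line breaks joined, and `parse_ask_timeout` is a hypothetical helper written for illustration, not part of Flink or Akka.

```python
import re

# Error text as reported in the post (line breaks joined).
ERROR = (
    "akka.pattern.AskTimeoutException: Ask timed out on "
    "[Actor[akka://flink/user/jobmanager_0#1140973613]] after [300000 ms]. "
    "Sender[null] sent message of type "
    '"org.apache.flink.runtime.rpc.messages.LocalFencedMessage"'
)


def parse_ask_timeout(message):
    """Return (actor_path, timeout_ms) from an AskTimeoutException message.

    Hypothetical helper for illustration; the pattern only assumes the
    message layout shown in the post.
    """
    match = re.search(
        r"Ask timed out on \[Actor\[(.+?)\]\] after \[(\d+) ms\]", message
    )
    if match is None:
        raise ValueError("not an AskTimeoutException message")
    return match.group(1), int(match.group(2))


actor, timeout_ms = parse_ask_timeout(ERROR)
print(actor)                           # the actor that did not answer in time
print(timeout_ms / 60_000, "minutes")  # 300000 ms converted to minutes
```

So the CLI waited 300000 ms, i.e. five minutes, for a reply from the `jobmanager_0` actor before giving up.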

flink savepoint --jobmanager xx-xx-5:8081 e569cf53baecae9cb4fa794d590d670f \
    hdfs://foundationhdfs/user/flink/savepoint3

I've grepped the logs for savepoint3 and savepoint4, and everything looks good to me.
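
For anyone who wants to repeat that check, a rough equivalent of the grep step might look like the sketch below. The function names and the log file names in the usage comment are assumptions made for illustration; only the search terms `savepoint3` and `savepoint4` come from the post.

```python
def grep_lines(lines, needles):
    """Yield (line_number, line) for every line containing any needle."""
    for number, line in enumerate(lines, start=1):
        if any(needle in line for needle in needles):
            yield number, line.rstrip("\n")


def grep_file(path, needles):
    """Scan one log file; undecodable bytes are replaced rather than fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return list(grep_lines(handle, needles))


# Usage (hypothetical file names, adjust to the actual JM and TM log paths):
#
#   for path in ("jobmanager.log", "taskmanager.log"):
#       for number, line in grep_file(path, ("savepoint3", "savepoint4")):
#           print(f"{path}:{number}: {line}")
```

Matching on the savepoint directory name keeps the output limited to the two savepoint attempts mentioned above.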

Could anyone please help me understand whether this is a bug or a feature of
Flink 1.6.0? ;)

Flink JM & TM logs:
https://drive.google.com/file/d/1Mg0qKJDOkYY14iM_gNO4QYjLB81JGzig/view?usp=sharing
https://drive.google.com/file/d/19l2HDR9bB7NC-SRyxvKY4qAN0BD650uA/view?usp=sharing

Re: [Possible Bug] Savepoints Akka Timeout

Till Rohrmann
Hi Sayat,

I think your problem might be caused by
https://issues.apache.org/jira/browse/FLINK-10193. This will be fixed in
the next bug fix release, which will happen in the next few days.

In the future, please post these kinds of questions to [hidden email].
The dev mailing list is intended for Flink development discussions.

Cheers,
Till

On Thu, Sep 20, 2018 at 2:04 PM Sayat Satybaldiyev <[hidden email]>
wrote:

> [quoted text: original message above]