Have trouble on running flink

Have trouble on running flink

Russell Bie
Hi Flink team,

I am trying to submit a Flink job (version 1.8.2) with the RocksDB state backend to my own YARN cluster (Hadoop version 2.6.0-cdh5.7.3). The job always fails after running for a few hours, when some TaskManagers lose their connection. The details are in this Stack Overflow question: <https://stackoverflow.com/questions/58046847/ioexception-when-taskmanager-restored-from-rocksdb-state-in-hdfs>. Could you provide some advice on this issue?

Thanks,
Russell

Re: Have trouble on running flink

Biao Liu
Hi Russell,

I don't think `BackendBuildingException` is the root cause. In your case,
this exception appears while the task is being cancelled.

Have you checked the log of the YARN NodeManager? It should contain the
exit code of the container. It is also quite likely that the container was
killed by the YARN NodeManager.
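
As a rough aid for that check, here is a minimal Python sketch that scans a
NodeManager log for container exit codes and memory-limit kills. The log line
shapes it matches are assumptions and differ between Hadoop versions, and the
function name `scan_nodemanager_log` is hypothetical, so adjust the patterns
to what your log actually contains.

```python
import re
import sys
from typing import Iterable, List, Tuple

# Assumed patterns; the exact NodeManager wording varies by Hadoop version.
CONTAINER_RE = re.compile(r"container_[0-9A-Za-z_]+")
EXIT_CODE_RE = re.compile(r"exit[ _]?(?:code|status)\D*?(-?\d+)", re.IGNORECASE)
MEMORY_KILL_RE = re.compile(
    r"running beyond (physical|virtual) memory limits", re.IGNORECASE
)


def scan_nodemanager_log(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (container_id, finding) pairs for lines that look relevant."""
    findings = []
    for line in lines:
        container = CONTAINER_RE.search(line)
        if container is None:
            continue
        container_id = container.group(0)

        memory_kill = MEMORY_KILL_RE.search(line)
        if memory_kill:
            kind = memory_kill.group(1).lower()
            findings.append((container_id, f"killed: beyond {kind} memory limits"))
            continue

        # Strip container IDs first so their digits are not read as the exit code.
        exit_code = EXIT_CODE_RE.search(CONTAINER_RE.sub("", line))
        if exit_code:
            findings.append((container_id, f"exit code {exit_code.group(1)}"))
    return findings


if __name__ == "__main__":
    # Usage: python scan_nm_log.py /path/to/nodemanager.log
    with open(sys.argv[1], errors="replace") as log_file:
        for container_id, finding in scan_nodemanager_log(log_file):
            print(f"{container_id}: {finding}")
```

Exit codes above 128 usually mean the process was terminated by a signal:
143 is 128 + 15 (SIGTERM) and 137 is 128 + 9 (SIGKILL).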

BTW, I think we should discuss this on the flink-user mailing list, not the
dev mailing list. I will forward this mail there.

Thanks,
Biao /'bɪ.aʊ/



On Tue, 24 Sep 2019 at 19:19, Russell Bie <[hidden email]> wrote:

> Hi Flink team,
>
> I am trying to submit flink job (version 1.8.2) with RocksDB backend to my
> own yarn cluster (hadoop version 2.6.0-cdh5.7.3), the job always failed
> after running for a few hours with the connection loss of some
> taskmanagers. Here<
> https://stackoverflow.com/questions/58046847/ioexception-when-taskmanager-restored-from-rocksdb-state-in-hdfs>
> is the question details on the stackoverflow. I am just wondering if you
> could provide some advice on this issue?
>
> Thanks,
> Russell
>
>