[HA] Is it possible to remove external dependency caused by high-availability.zookeeper.storageDir

Hao Chen
Hi,

Like many companies, we run separate architectures for our near-real-time
and offline analytics platforms (Flink vs. Hadoop), so we would rather not
have the Flink cluster depend on a shared HDFS as an additional operational
dependency, especially since we may deploy the whole streaming stack
(Kafka and Flink) on k8s without HDFS.

Is it possible to implement a way to sync metadata across the different HA
job managers instead of depending on a shared file system
for high-availability.zookeeper.storageDir?


--

Hao

Re: [HA] Is it possible to remove external dependency caused by high-availability.zookeeper.storageDir

Fabian Hueske-2
Hi Hao,

You do not necessarily need HDFS, but some kind of distributed filesystem
that can be accessed from all nodes is required.
Flink needs the filesystem not only for job metadata, but also to store
checkpoints of the application state for fault tolerance.

Best, Fabian

2018-05-08 4:31 GMT+02:00 Hao Chen <[hidden email]>:

> Hi,
>
> Like many companies, we run separate architectures for our near-real-time
> and offline analytics platforms (Flink vs. Hadoop), so we would rather not
> have the Flink cluster depend on a shared HDFS as an additional operational
> dependency, especially since we may deploy the whole streaming stack
> (Kafka and Flink) on k8s without HDFS.
>
> Is it possible to implement a way to sync metadata across the different HA
> job managers instead of depending on a shared file system
> for high-availability.zookeeper.storageDir?
>
>
> --
>
> Hao
>
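
To make the point above concrete, here is a minimal sketch of a flink-conf.yaml that uses ZooKeeper HA without HDFS, assuming a shared volume is mounted at the same path on every node (for example NFS or a k8s ReadWriteMany volume). The ZooKeeper host names and the /mnt/flink-shared paths are placeholders, not values from this thread. As far as I know, newer Flink versions spell the option high-availability.storageDir and accept the zookeeper-prefixed name from the subject line as a deprecated alias; check the docs for your version.

```yaml
# flink-conf.yaml (sketch; hosts and paths below are hypothetical placeholders)

# ZooKeeper handles leader election and holds pointers to the job metadata.
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-0:2181,zk-1:2181,zk-2:2181

# The metadata itself is written here, so every JobManager must be able to
# reach this path. Any shared filesystem URI can be used, not only hdfs://.
high-availability.storageDir: file:///mnt/flink-shared/ha

# Checkpoints of application state need shared storage as well.
state.backend: filesystem
state.checkpoints.dir: file:///mnt/flink-shared/checkpoints
```

With a setup along these lines the external dependency is still there, but it is the shared volume rather than a Hadoop cluster.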