(DEPRECATED) Apache Flink Mailing List archive.

Make SubmittedJobGraphStore configurable


chenqin

Make SubmittedJobGraphStore configurable

Hi there,

I would like to propose and discuss a medium-sized refactoring to make
SubmittedJobGraphStore configurable and extensible.

The rationale is to let users offload this metadata to durable, cross-DC
storage with strong read-after-write consistency, and to decouple it from
the ZooKeeper quorum.

https://issues.apache.org/jira/browse/FLINK-7106

The new configurable settings in flink.conf would look like the following:

g
raph
-s
tore:
customized/zookeeper
g
raph
-s
tore.class: xx.yy.MyS3SubmittedJobGraphStoreImp

g
raph
-s
tore.
endpoint
: s3.amazonaws.com
g
raph
-s
tore.path.root:
s3:/

/
my root/

Thanks,
Chen

Ted Yu

Re: Make SubmittedJobGraphStore configurable

The sample config entries are broken into multiple lines.

Can you send the config again with one entry per line?

Cheers

On Wed, Jul 5, 2017 at 10:19 PM, Chen Qin <[hidden email]> wrote:

> Hi there,
>
> I would like to propose and discuss a medium-sized refactoring to make
> SubmittedJobGraphStore configurable and extensible.
>
> The rationale is to let users offload this metadata to durable, cross-DC
> storage with strong read-after-write consistency, and to decouple it from
> the ZooKeeper quorum.
>
>
> https://issues.apache.org/jira/browse/FLINK-7106
>
> The new configurable settings in flink.conf would look like the following:
>
> g
> raph
> -s
> tore:
> customized/zookeeper
> g
> raph
> -s
> tore.class: xx.yy.MyS3SubmittedJobGraphStoreImp
>
> g
> raph
> -s
> tore.
> endpoint
> : s3.amazonaws.com
> g
> raph
> -s
> tore.path.root:
> s3:/
>
> /
> my root/
>
> Thanks,
> Chen
>

chenqin

Re: Make SubmittedJobGraphStore configurable

Sure,
I would imagine a couple of extra lines in flink.conf:
...
graphstore.type: customized/zookeeper
graphstore.class: org.apache.flink.contrib.MyS3SubmittedJobGraphStoreImp
graphstore.endpoint: s3.amazonaws.com
graphstore.path.root: s3://my root/

which overrides how the store is created in

*org.apache.flink.runtime.highavailability.HighAvailabilityServices*

/**
 * Gets the submitted job graph store for the job manager
 *
 * @return Submitted job graph store
 * @throws Exception if the submitted job graph store could not be created
 */
SubmittedJobGraphStore getSubmittedJobGraphStore() throws Exception;

In this case, users implement their own S3-backed job graph store, which
stores job graphs in S3 instead of in ZooKeeper (high-availability mode) or
not at all (non-HA mode).

I find [1] somewhat related, though it focuses more on the life cycle and
dependency aspects of the graph store and checkpoint store. FLINK-7106 is
limited to letting users plug in their own job graph store implementation
instead of the one hardcoded to ZooKeeper.
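
To make the idea concrete, here is a minimal sketch of how such settings
could drive the choice of store. It is not Flink code: the JobGraphStore
interface, both stub classes, and the create() method are hypothetical
stand-ins, and only the graphstore.* key names come from the settings above.

```java
import java.util.HashMap;
import java.util.Map;

class GraphStoreFactorySketch {

    /** Hypothetical stand-in for Flink's SubmittedJobGraphStore (not the real interface). */
    public interface JobGraphStore {
        void configure(Map<String, String> settings);
        String describe();
    }

    /** Hypothetical default, standing in for the ZooKeeper-backed store. */
    public static class ZooKeeperStoreStub implements JobGraphStore {
        @Override public void configure(Map<String, String> settings) { }
        @Override public String describe() { return "zookeeper"; }
    }

    /** Hypothetical user-supplied store; only remembers its configured root path. */
    public static class S3StoreStub implements JobGraphStore {
        private String root = "";
        @Override public void configure(Map<String, String> settings) {
            root = settings.getOrDefault("graphstore.path.root", "");
        }
        @Override public String describe() { return "s3-stub:" + root; }
    }

    /** Picks the store from the settings, loading a custom class reflectively. */
    public static JobGraphStore create(Map<String, String> conf) throws Exception {
        String type = conf.getOrDefault("graphstore.type", "zookeeper");
        JobGraphStore store;
        if ("customized".equals(type)) {
            String className = conf.get("graphstore.class");
            if (className == null) {
                throw new IllegalArgumentException("graphstore.class must be set");
            }
            // asSubclass fails fast if the class does not implement the interface.
            Class<? extends JobGraphStore> clazz =
                    Class.forName(className).asSubclass(JobGraphStore.class);
            store = clazz.getDeclaredConstructor().newInstance();
        } else {
            store = new ZooKeeperStoreStub();
        }
        store.configure(conf);
        return store;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> conf = new HashMap<>();
        conf.put("graphstore.type", "customized");
        conf.put("graphstore.class", S3StoreStub.class.getName());
        conf.put("graphstore.path.root", "s3://example-bucket/graphs/");
        System.out.println(create(conf).describe());
    }
}
```

With graphstore.type left unset, the sketch falls back to the ZooKeeper
stub, so existing setups would keep their current behaviour.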

Thanks,
Chen

[1] https://issues.apache.org/jira/browse/FLINK-6626

On Thu, Jul 6, 2017 at 2:47 AM, Ted Yu <[hidden email]> wrote:

> The sample config entries are broken into multiple lines.
>
> Can you send the config again with one entry per line?
>
> Cheers

Till Rohrmann

Re: Make SubmittedJobGraphStore configurable

If there is a need for this, then we can definitely make this configurable.
The interface SubmittedJobGraphStore is already there.
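
As a rough illustration of what a pluggable implementation behind such an
interface might look like, here is a self-contained sketch. The
SimpleJobGraphStore interface and its method names are simplified
assumptions for this example and do not reproduce the real
SubmittedJobGraphStore signature; the in-memory map merely stands in for an
external backend such as S3.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PluggableStoreSketch {

    /** Simplified, hypothetical store interface used only for this sketch. */
    public interface SimpleJobGraphStore {
        void putJobGraph(String jobId, byte[] serializedGraph) throws Exception;
        byte[] recoverJobGraph(String jobId) throws Exception; // null if unknown
        void removeJobGraph(String jobId) throws Exception;
        Collection<String> getJobIds() throws Exception;
    }

    /** In-memory stand-in for a durable backend; copies arrays defensively. */
    public static class InMemoryJobGraphStore implements SimpleJobGraphStore {
        private final Map<String, byte[]> graphs = new ConcurrentHashMap<>();

        @Override public void putJobGraph(String jobId, byte[] serializedGraph) {
            graphs.put(jobId, serializedGraph.clone());
        }

        @Override public byte[] recoverJobGraph(String jobId) {
            byte[] graph = graphs.get(jobId);
            return graph == null ? null : graph.clone();
        }

        @Override public void removeJobGraph(String jobId) {
            graphs.remove(jobId);
        }

        @Override public Collection<String> getJobIds() {
            return new ArrayList<>(graphs.keySet());
        }
    }
}
```

A real backend would replace the map operations with reads and writes
against the external storage and would need to handle failures there.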

Cheers,
Till

On Fri, Jul 7, 2017 at 6:32 AM, Chen Qin <[hidden email]> wrote:

> Sure,
> I would imagine a couple of extra lines in flink.conf:
> ...
> graphstore.type: customized/zookeeper
> graphstore.class: org.apache.flink.contrib.MyS3SubmittedJobGraphStoreImp
> graphstore.endpoint: s3.amazonaws.com
> graphstore.path.root: s3://my root/
>
> which overrides how the store is created in
>
> *org.apache.flink.runtime.highavailability.HighAvailabilityServices*
>
> /**
>  * Gets the submitted job graph store for the job manager
>  *
>  * @return Submitted job graph store
>  * @throws Exception if the submitted job graph store could not be created
>  */
> SubmittedJobGraphStore getSubmittedJobGraphStore() throws Exception;
>
> In this case, users implement their own S3-backed job graph store, which
> stores job graphs in S3 instead of in ZooKeeper (high-availability mode) or
> not at all (non-HA mode).
>
> I find [1] somewhat related, though it focuses more on the life cycle and
> dependency aspects of the graph store and checkpoint store. FLINK-7106 is
> limited to letting users plug in their own job graph store implementation
> instead of the one hardcoded to ZooKeeper.
>
> Thanks,
> Chen
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-6626

chenqin

Re: Make SubmittedJobGraphStore configurable

Hi Till,

As far as I know, there is interest in keeping job graphs recoverable through hiccups in a shared ZooKeeper, or in standalone mode with customized leader election.

I plan to spend a bit of time prototyping backup to Amazon S3. I will keep folks updated as soon as I get the happy path working.

Thanks,
Chen

> On Jul 25, 2017, at 6:07 AM, Till Rohrmann <[hidden email]> wrote:
>
> If there is a need for this, then we can definitely make this configurable.
> The interface SubmittedJobGraphStore is already there.
>
> Cheers,
> Till