(DEPRECATED) Apache Flink Mailing List archive.

Change in the JobManager API

Classic

List

Threaded

4 messages Options

Stephan Ewen

Change in the JobManager API

Hi!

I have just pushed a big patch to rework the JobManager job and scheduling
classes. It fixes some scalability and robstness issues,
simplifies the task hierarchies, and makes the code ready for some of the
prepared next features (incremental/interactive jobs).

The pull request is https://github.com/apache/incubator-flink/pull/122

What will affect developers that go against the lower level APIs (like the
streaming parts) is the following:

- No more distrinction between input/intermediate/output tasks
- Intermediate data sets have a data structure now. This implies that some
methods change slightly (more in name than in meaning).
In the future, data sets can be consumed many times, but for now, the
network stack supports only one cosumer.
- The conceptual change that receivers attach senders as inputs (and grab
their outgoing data streams), rather than senders forwarding to
receivers means that the wiring of JobGraphs is now the other way around.
- No more distinction between in-memory and network channels. All channels
have always been automatically in-memory, when senders
and receiver are co-located. The flag was purely a scheduler hint, which
is obsolete now (see below).

Most importantly:
- The scheduling is a bit different now. Instread of instance sharing, we
now have SlotSharing Groups, which give you
a way to share resources across tasks, but they behave more dynamic,
which is important for more dynamic environments,
and when a cluster has less task slots than the parallelism of some
tasks is.
- For cases that need strict co-location of tasks, we now have
CoLocationConstraints. The Batch API uses them to ensure that
head, tail, and tasks inside a closed-loop iteration are co-located.

Stephan

Stephan Ewen

Re: Change in the JobManager API

Edit: I have not pushed it, I am about to push ;-)

Just needed to rebase on the latest master an tests are pending...

On Sat, Sep 20, 2014 at 8:24 PM, Stephan Ewen <[hidden email]> wrote:

> Hi!
>
> I have just pushed a big patch to rework the JobManager job and scheduling
> classes. It fixes some scalability and robstness issues,
> simplifies the task hierarchies, and makes the code ready for some of the
> prepared next features (incremental/interactive jobs).
>
> The pull request is https://github.com/apache/incubator-flink/pull/122
>
> What will affect developers that go against the lower level APIs (like the
> streaming parts) is the following:
>
> - No more distrinction between input/intermediate/output tasks
> - Intermediate data sets have a data structure now. This implies that
> some methods change slightly (more in name than in meaning).
> In the future, data sets can be consumed many times, but for now, the
> network stack supports only one cosumer.
> - The conceptual change that receivers attach senders as inputs (and grab
> their outgoing data streams), rather than senders forwarding to
> receivers means that the wiring of JobGraphs is now the other way
> around.
> - No more distinction between in-memory and network channels. All
> channels have always been automatically in-memory, when senders
> and receiver are co-located. The flag was purely a scheduler hint,
> which is obsolete now (see below).
>
>
> Most importantly:
> - The scheduling is a bit different now. Instread of instance sharing, we
> now have SlotSharing Groups, which give you
> a way to share resources across tasks, but they behave more dynamic,
> which is important for more dynamic environments,
> and when a cluster has less task slots than the parallelism of some
> tasks is.
> - For cases that need strict co-location of tasks, we now have
> CoLocationConstraints. The Batch API uses them to ensure that
> head, tail, and tasks inside a closed-loop iteration are co-located.
>
> Stephan
>
>

Stephan Ewen

Re: Change in the JobManager API

There is one more affect of the changes: Since there is no more distinction
between input/output vertices and since disconnected flows are also
accepted now, the job manager will not reject any more certain graphs that
it used to reject.

That is actually desirable, but I think the streaming API made use of that
behavior to validate that the programs have at least a connected source and
sink.

This need checks at a different point now.

On Sat, Sep 20, 2014 at 8:25 PM, Stephan Ewen <[hidden email]> wrote:

> Edit: I have not pushed it, I am about to push ;-)
>
> Just needed to rebase on the latest master an tests are pending...
>
> On Sat, Sep 20, 2014 at 8:24 PM, Stephan Ewen <[hidden email]> wrote:
>
>> Hi!
>>
>> I have just pushed a big patch to rework the JobManager job and
>> scheduling classes. It fixes some scalability and robstness issues,
>> simplifies the task hierarchies, and makes the code ready for some of the
>> prepared next features (incremental/interactive jobs).
>>
>> The pull request is https://github.com/apache/incubator-flink/pull/122
>>
>> What will affect developers that go against the lower level APIs (like
>> the streaming parts) is the following:
>>
>> - No more distrinction between input/intermediate/output tasks
>> - Intermediate data sets have a data structure now. This implies that
>> some methods change slightly (more in name than in meaning).
>> In the future, data sets can be consumed many times, but for now, the
>> network stack supports only one cosumer.
>> - The conceptual change that receivers attach senders as inputs (and
>> grab their outgoing data streams), rather than senders forwarding to
>> receivers means that the wiring of JobGraphs is now the other way
>> around.
>> - No more distinction between in-memory and network channels. All
>> channels have always been automatically in-memory, when senders
>> and receiver are co-located. The flag was purely a scheduler hint,
>> which is obsolete now (see below).
>>
>>
>> Most importantly:
>> - The scheduling is a bit different now. Instread of instance sharing,
>> we now have SlotSharing Groups, which give you
>> a way to share resources across tasks, but they behave more dynamic,
>> which is important for more dynamic environments,
>> and when a cluster has less task slots than the parallelism of some
>> tasks is.
>> - For cases that need strict co-location of tasks, we now have
>> CoLocationConstraints. The Batch API uses them to ensure that
>> head, tail, and tasks inside a closed-loop iteration are co-located.
>>
>> Stephan
>>
>>
>
>

Márton Balassi-3

Re: Change in the JobManager API

Thanks for pointing that out.
I personally prefer that this way it is not necessary to explicitly "close"
a DataSet or DataStream with a sink. We need to update the corresponding
tests however.

On Sun, Sep 21, 2014 at 6:49 PM, Stephan Ewen <[hidden email]> wrote:

> There is one more affect of the changes: Since there is no more
> distinction between input/output vertices and since disconnected flows are
> also accepted now, the job manager will not reject any more certain graphs
> that it used to reject.
>
> That is actually desirable, but I think the streaming API made use of that
> behavior to validate that the programs have at least a connected source and
> sink.
>
> This need checks at a different point now.
>
>
> On Sat, Sep 20, 2014 at 8:25 PM, Stephan Ewen <[hidden email]> wrote:
>
>> Edit: I have not pushed it, I am about to push ;-)
>>
>> Just needed to rebase on the latest master an tests are pending...
>>
>> On Sat, Sep 20, 2014 at 8:24 PM, Stephan Ewen <[hidden email]> wrote:
>>
>>> Hi!
>>>
>>> I have just pushed a big patch to rework the JobManager job and
>>> scheduling classes. It fixes some scalability and robstness issues,
>>> simplifies the task hierarchies, and makes the code ready for some of
>>> the prepared next features (incremental/interactive jobs).
>>>
>>> The pull request is https://github.com/apache/incubator-flink/pull/122
>>>
>>> What will affect developers that go against the lower level APIs (like
>>> the streaming parts) is the following:
>>>
>>> - No more distrinction between input/intermediate/output tasks
>>> - Intermediate data sets have a data structure now. This implies that
>>> some methods change slightly (more in name than in meaning).
>>> In the future, data sets can be consumed many times, but for now, the
>>> network stack supports only one cosumer.
>>> - The conceptual change that receivers attach senders as inputs (and
>>> grab their outgoing data streams), rather than senders forwarding to
>>> receivers means that the wiring of JobGraphs is now the other way
>>> around.
>>> - No more distinction between in-memory and network channels. All
>>> channels have always been automatically in-memory, when senders
>>> and receiver are co-located. The flag was purely a scheduler hint,
>>> which is obsolete now (see below).
>>>
>>>
>>> Most importantly:
>>> - The scheduling is a bit different now. Instread of instance sharing,
>>> we now have SlotSharing Groups, which give you
>>> a way to share resources across tasks, but they behave more dynamic,
>>> which is important for more dynamic environments,
>>> and when a cluster has less task slots than the parallelism of some
>>> tasks is.
>>> - For cases that need strict co-location of tasks, we now have
>>> CoLocationConstraints. The Batch API uses them to ensure that
>>> head, tail, and tasks inside a closed-loop iteration are co-located.
>>>
>>> Stephan
>>>
>>>
>>
>>
>