Flink Streaming parallelism bug report

Flink Streaming parallelism bug report

Szabó Péter
As far as I know, the point at which the execution environment is created
has changed slightly in the streaming API. As a result,
dataStream.getParallelism() and dataStream.env.getDegreeOfParallelism() may
return different values. Using the former is recommended.
In theory, the latter has been eliminated from the code, but a few uses
might still be hiding. I recently fixed one in WindowedDataStream. If you
run into problems with parallelism, this may be the cause.

Peter

Re: Flink Streaming parallelism bug report

Gyula Fóra
They should actually return different values in many cases.

dataStream.env.getDegreeOfParallelism() returns the environment parallelism
(the default).

dataStream.getParallelism() returns the parallelism of the operator. There
is a reason why one or the other is used in each place.

Please be careful when you modify that, because you might actually
break functionality there :p
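
The distinction above can be sketched with a toy model. This is not Flink
code: Env, Stream and withParallelism are hypothetical stand-ins invented
for illustration, and only the two getter names are taken from this thread.
It assumes an operator inherits the environment default unless it has its
own setting.

```java
public class ParallelismSketch {

    /** Hypothetical environment: holds only the job-wide default parallelism. */
    static final class Env {
        private final int defaultParallelism;

        Env(int defaultParallelism) {
            this.defaultParallelism = defaultParallelism;
        }

        int getDegreeOfParallelism() {
            return defaultParallelism;
        }
    }

    /** Hypothetical stream: uses its own parallelism if set, else the default. */
    static final class Stream {
        final Env env;
        private final Integer operatorParallelism; // null means "inherit from env"

        Stream(Env env) {
            this(env, null);
        }

        private Stream(Env env, Integer operatorParallelism) {
            this.env = env;
            this.operatorParallelism = operatorParallelism;
        }

        /** Returns a new stream whose operator has its own parallelism. */
        Stream withParallelism(int parallelism) {
            return new Stream(env, parallelism);
        }

        int getParallelism() {
            return operatorParallelism != null
                    ? operatorParallelism
                    : env.getDegreeOfParallelism();
        }
    }

    public static void main(String[] args) {
        Env env = new Env(4);
        Stream source = new Stream(env);
        Stream windowed = source.withParallelism(2);

        System.out.println(source.getParallelism());               // 4
        System.out.println(windowed.getParallelism());             // 2
        System.out.println(windowed.env.getDegreeOfParallelism()); // 4
    }
}
```

In this model the two getters agree only while the operator has no setting
of its own, which is why swapping one call for the other can change
behavior.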

In reply to Szabó Péter's message of Feb 27, 2015 8:55 AM.

Re: Flink Streaming parallelism bug report

Szabó Péter
Okay, thanks!

In my case, I ran an ITCase test in which the environment parallelism
happened to be -1, and an exception was thrown. The other ITCases ran
properly, so I figured the problem is with the windowing.
Can you check it out for me? (WindowedDataStream, line 348)

Peter
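
A minimal sketch of how a value of -1 can surface as an exception. It
assumes, which this thread does not confirm, that -1 marks a parallelism
that has not been configured yet. UNSET and resolve are hypothetical names,
not part of the Flink API.

```java
public class UnsetParallelismSketch {

    /** Assumption: -1 marks a parallelism that has not been configured yet. */
    static final int UNSET = -1;

    /**
     * Prefers the operator's value, falls back to the environment default,
     * and fails fast if neither has been configured.
     */
    static int resolve(int operatorParallelism, int envParallelism) {
        int resolved = operatorParallelism != UNSET
                ? operatorParallelism
                : envParallelism;
        if (resolved < 1) {
            throw new IllegalStateException(
                    "Parallelism must be at least 1 but was " + resolved);
        }
        return resolved;
    }

    public static void main(String[] args) {
        System.out.println(resolve(2, UNSET)); // 2
        System.out.println(resolve(UNSET, 4)); // 4
        try {
            resolve(UNSET, UNSET);
        } catch (IllegalStateException e) {
            // Prints: Parallelism must be at least 1 but was -1
            System.out.println(e.getMessage());
        }
    }
}
```

Under that assumption, code that reads only the environment value fails
whenever the environment is still unset, even if the operator itself has a
valid parallelism.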

In reply to Gyula Fóra's message of 2015-02-27 10:06 GMT+01:00.

Re: Flink Streaming parallelism bug report

Gyula Fóra
I can't look at it at the moment; I am on vacation and don't have my
laptop.

In reply to Szabó Péter's message of Feb 27, 2015 9:41 AM.

Re: Flink Streaming parallelism bug report

Szabó Péter
No problem.
I will not commit the modification until it is clarified.

Peter

In reply to Gyula Fóra's message of 2015-02-27 10:48 GMT+01:00.