Data overflow in SpillingResettableMutableObjectIterator

Jian Cao
Hi all,
We are using Flink's iteration and found that
SpillingResettableMutableObjectIterator has a data overflow problem if
the number of elements in a single input exceeds Integer.MAX_VALUE.

The reason is that SpillingResettableMutableObjectIterator tracks the
total number of elements and the number of elements currently read in
two int fields (elementCount and currentElementNum). If the number of
elements exceeds Integer.MAX_VALUE, these counters overflow.
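
To illustrate the wrap-around, here is a minimal, self-contained sketch. It
is not Flink code: the class and method names are hypothetical, and the
"read < total" guard is only an assumption about how such counters are
typically compared.

```java
public class IntCounterOverflowDemo {

    /** Increments an int counter the way a plain "count++" would. */
    public static int incrementInt(int count) {
        return count + 1; // wraps silently past Integer.MAX_VALUE
    }

    /** The same increment on a long counter. */
    public static long incrementLong(long count) {
        return count + 1;
    }

    public static void main(String[] args) {
        int elementCount = incrementInt(Integer.MAX_VALUE);
        long wideCount = incrementLong(Integer.MAX_VALUE);

        System.out.println(elementCount); // -2147483648
        System.out.println(wideCount);    // 2147483648

        // Assumed guard: after a reset the read counter starts at 0, so a
        // negative total makes the check fail on the very first element.
        int currentElementNum = 0;
        System.out.println(currentElementNum < elementCount); // false
    }
}
```

If the int type had to be kept, Math.incrementExact(int) would at least
turn the silent wrap-around into an ArithmeticException.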

If an overflow occurs, then in the next iteration, after the input is
reset, the data is either not read at all or only partially read.

Therefore, I suggest changing the type of these two fields of
SpillingResettableMutableObjectIterator from int to long.
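
A rough sketch of what the counters could look like with long fields. This
is a hypothetical, simplified stand-in and not the actual Flink class: only
the two field names come from the description above, the spilled data itself
is not modelled, and recordElements(long) exists only so the demo does not
have to loop two billion times.

```java
public class ReplayCounters {

    // Both counters are long, so they do not wrap at Integer.MAX_VALUE.
    private long elementCount;      // elements recorded during the first pass
    private long currentElementNum; // elements handed out in the current pass

    /** Called once per element while the input is consumed the first time. */
    public void recordElement() {
        elementCount++;
    }

    /** Demo helper (hypothetical): records n elements without looping. */
    public void recordElements(long n) {
        elementCount += n;
    }

    /** Starts a new pass over the recorded elements. */
    public void reset() {
        currentElementNum = 0;
    }

    /** Returns true and advances if another element is left in this pass. */
    public boolean advance() {
        if (currentElementNum < elementCount) {
            currentElementNum++;
            return true;
        }
        return false;
    }

    public long getElementCount() {
        return elementCount;
    }

    public static void main(String[] args) {
        ReplayCounters counters = new ReplayCounters();
        counters.recordElements(Integer.MAX_VALUE);
        counters.recordElement(); // one element past the int range
        counters.reset();

        System.out.println(counters.getElementCount()); // 2147483648
        System.out.println(counters.advance());         // true
    }
}
```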

Best regards.

Re: Data overflow in SpillingResettableMutableObjectIterator

Piotr Nowojski-3
Hi Jian,

Thank you for reporting the issue. I see that you have already created a ticket for this [1].

Piotrek

[1] https://issues.apache.org/jira/browse/FLINK-15549


> On 9 Jan 2020, at 09:10, Jian Cao <[hidden email]> wrote:
>
> Hi all,
> We are using Flink's iteration and found that SpillingResettableMutableObjectIterator has a data overflow problem if the number of elements in a single input exceeds Integer.MAX_VALUE.
>
> The reason is that SpillingResettableMutableObjectIterator tracks the total number of elements and the number of elements currently read in two int fields (elementCount and currentElementNum). If the number of elements exceeds Integer.MAX_VALUE, these counters overflow.
>
> If an overflow occurs, then in the next iteration, after the input is reset, the data is either not read at all or only partially read.
>
> Therefore, I suggest changing the type of these two fields of SpillingResettableMutableObjectIterator from int to long.
>
> Best regards.