[jira] [Created] (FLINK-1085) Unnecessary failing of GroupReduceCombineDriver


Shang Yuanchun (Jira)
Fabian Hueske created FLINK-1085:
------------------------------------

             Summary: Unnecessary failing of GroupReduceCombineDriver
                 Key: FLINK-1085
                 URL: https://issues.apache.org/jira/browse/FLINK-1085
             Project: Flink
          Issue Type: Bug
          Components: Local Runtime
    Affects Versions: 0.7-incubating, 0.6.1-incubating
            Reporter: Fabian Hueske


With a recent update (commit cbbcf7820885a8a9734ffeba637b0182a6637939), the GroupReduceCombineDriver was changed so that it no longer uses an asynchronous partial sorter. Instead, the driver fills a sort buffer with records, sorts the buffer, combines the sorted records, clears the buffer, and then starts filling it again.

The GroupReduceCombineDriver fails if a record cannot be serialized into an empty sort buffer, i.e., if the record is too large for the buffer.

Instead of failing, the driver should emit a WARN message for the first record that is too large and simply forward every record that does not fit into the empty sort buffer. It could also keep counting how many records were forwarded this way and emit a second WARN message with that statistic.
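
A minimal sketch of the proposed behavior follows. It is not the actual GroupReduceCombineDriver code: the class ForwardingCombinerSketch, the Rec type, the size model (key length stands in for serialized size), and the in-memory output and warnings lists are all hypothetical and chosen only to make the control flow visible. A record that does not fit triggers a sort/combine/clear cycle first; if it still does not fit into the empty buffer, it is forwarded uncombined instead of raising an error.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for the combine driver; none of these names are Flink classes.
class ForwardingCombinerSketch {

    /** A (key, count) record. Its serialized size is modeled as key.length() (assumption). */
    static final class Rec {
        final String key;
        final int count;
        Rec(String key, int count) { this.key = key; this.count = count; }
        int size() { return key.length(); }
        @Override public String toString() { return key + "=" + count; }
    }

    private final int capacity;                    // sort buffer capacity in "size units"
    private final List<Rec> buffer = new ArrayList<>();
    private int used = 0;                          // size units currently occupied
    private long forwarded = 0;                    // records passed through uncombined

    final List<String> output = new ArrayList<>();   // stands in for the output collector
    final List<String> warnings = new ArrayList<>(); // stands in for the logger

    ForwardingCombinerSketch(int capacity) { this.capacity = capacity; }

    void add(Rec r) {
        if (used + r.size() > capacity) {
            flush();                               // buffer full: sort, combine, clear
        }
        if (r.size() > capacity) {
            // Record does not fit even into the empty buffer: forward it instead of failing.
            if (forwarded == 0) {
                warnings.add("WARN: record too large for sort buffer, forwarding uncombined");
            }
            forwarded++;
            output.add(r.toString());
            return;
        }
        buffer.add(r);
        used += r.size();
    }

    /** Sorts the buffer by key, sums the counts of each key group, then clears the buffer. */
    void flush() {
        buffer.sort(Comparator.comparing((Rec r) -> r.key));
        int i = 0;
        while (i < buffer.size()) {
            String key = buffer.get(i).key;
            int sum = 0;
            while (i < buffer.size() && buffer.get(i).key.equals(key)) {
                sum += buffer.get(i).count;
                i++;
            }
            output.add(key + "=" + sum);
        }
        buffer.clear();
        used = 0;
    }

    /** Combines whatever is left and reports the forwarding statistic, if any. */
    void close() {
        flush();
        if (forwarded > 0) {
            warnings.add("WARN: forwarded " + forwarded + " record(s) without combining");
        }
    }
}
```

Replacing the forwarding branch in add() with a thrown exception gives the failing behavior described above. Forwarding is safe only because a combiner is an optional pre-aggregation: the downstream reduce still sees every record, just less compacted.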



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)