[jira] [Created] (FLINK-1474) Cross produces wrong results for non-tiny inputs


Shang Yuanchun (Jira)
Fabian Hueske created FLINK-1474:
------------------------------------

             Summary: Cross produces wrong results for non-tiny inputs
                 Key: FLINK-1474
                 URL: https://issues.apache.org/jira/browse/FLINK-1474
             Project: Flink
          Issue Type: Bug
          Components: Local Runtime
    Affects Versions: 0.8
            Reporter: Fabian Hueske
            Assignee: Fabian Hueske
            Priority: Critical
             Fix For: 0.9, 0.8.1


The cross operator produces wrong results for larger inputs.
It appears that the number of emitted records is correct, but the values are not.
There seems to be a cutoff at 128 elements.

To reproduce, add the following test to {CrossITCase} and execute it:

{code}
    @Test
    public void testCorrectnessOfCrossOfLargeInputs() throws Exception {
        /*
         * check correctness of cross with larger inputs
         */

        final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Long> ds = env.generateSequence(0, 1000);
        DataSet<Long> ds2 = env.generateSequence(0, 1000);
        DataSet<Tuple> crossDs = ds.cross(ds2)
                .filter(new FilterFunction<Tuple2<Long, Long>>() {
                    @Override
                    public boolean filter(Tuple2<Long, Long> value) throws Exception {
                        return value.f0 == value.f1;
                    }
                })
                .project(0).sum(0);

        crossDs.writeAsCsv(resultPath);
        env.execute();

        expected = "(500500)\n"; // 500500 = 0+1+2+...+999+1000
    }
{code}
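
The cutoff at 128 is worth checking against Java's boxing behavior before attributing it to the cross operator. {{Long.valueOf}} is documented to cache values from -128 to 127, so a reference comparison with {{==}} on two boxed {{Long}} objects is only guaranteed to succeed in that range, and the filter in the test compares {{value.f0 == value.f1}} on boxed fields. The following standalone sketch contrasts reference equality with {{equals}} over the same 0 to 1000 range. It does not use Flink, the class and method names are hypothetical, and it assumes the runtime hands the filter two separately boxed objects:

```java
// Standalone illustration, not part of Flink or CrossITCase.
// Class and method names are hypothetical.
final class BoxedLongComparison {

    // Sums every i in [0, n] whose two separately boxed copies pass a reference check (==).
    static long sumWithReferenceEquality(long n) {
        long sum = 0;
        for (long i = 0; i <= n; i++) {
            Long left = Long.valueOf(i);
            Long right = Long.valueOf(i);
            if (left == right) { // compares object identity, not the numeric value
                sum += i;
            }
        }
        return sum;
    }

    // Sums every i in [0, n] whose two separately boxed copies are equal by value.
    static long sumWithValueEquality(long n) {
        long sum = 0;
        for (long i = 0; i <= n; i++) {
            Long left = Long.valueOf(i);
            Long right = Long.valueOf(i);
            if (left.equals(right)) { // compares the wrapped long values
                sum += i;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("reference equality: " + sumWithReferenceEquality(1000));
        System.out.println("value equality:     " + sumWithValueEquality(1000));
    }
}
```

With {{equals}}, the sum is 500500, the value the test expects. With {{==}}, only the cached range is guaranteed to match, and 0 through 127 sum to 8128; whether larger values also match depends on the JVM. If the test still fails after switching the filter to {{value.f0.equals(value.f1)}}, the problem would lie in the operator itself.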



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)