[jira] [Created] (FLINK-6015) "Row too short" when reading CSV line with empty last field (i.e. ending in comma)


Shang Yuanchun (Jira)
Luke Hutchison created FLINK-6015:
-------------------------------------

             Summary: "Row too short" when reading CSV line with empty last field (i.e. ending in comma)
                 Key: FLINK-6015
                 URL: https://issues.apache.org/jira/browse/FLINK-6015
             Project: Flink
          Issue Type: Bug
          Components: Batch Connectors and Input/Output Formats
    Affects Versions: 1.2.0
         Environment: Linux
            Reporter: Luke Hutchison


When reading a file with env.readCsvFile(filename), a line whose last field is empty ends with a comma. Such a line triggers an exception in GenericCsvInputFormat.parseRecord():

    // check valid start position
    if (startPos >= limit) {
        if (lenient) {
            return false;
        } else {
            throw new ParseException("Row too short: " + new String(bytes, offset, numBytes));
        }
    }

Setting the parser to lenient is not a fix: the last field would be left as null rather than being set to "".

The parser should accept empty values for the last field on a row.
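
To illustrate the requested behavior, here is a minimal sketch in plain Java. It is not Flink's parser: the class TrailingFieldSketch and the helper splitFields are hypothetical names used only for this example, and quoting and escaping are ignored. It shows one way a splitter can treat a trailing delimiter as the start of an empty last field instead of as a short row.

```java
import java.util.ArrayList;
import java.util.List;

public class TrailingFieldSketch {

    // Hypothetical helper (not part of Flink): splits a line on a single-character
    // delimiter and keeps an empty last field when the line ends with the delimiter.
    static List<String> splitFields(String line, char delimiter) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        while (true) {
            int end = line.indexOf(delimiter, start);
            if (end < 0) {
                // No further delimiter: the rest of the line is the last field.
                // When start == line.length(), this adds "" rather than failing.
                fields.add(line.substring(start));
                return fields;
            }
            fields.add(line.substring(start, end));
            start = end + 1;
        }
    }

    public static void main(String[] args) {
        // A row with three fields, the last of which is empty.
        List<String> fields = splitFields("a,b,", ',');
        System.out.println(fields.size() + " fields, last is empty: " + fields.get(2).isEmpty());
    }
}
```

With this approach the row "a,b," yields three fields and the third is "" rather than null. The same distinction exists in the JDK: String.split(",") discards trailing empty strings, while String.split(",", -1) keeps them.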



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)