[ https://issues.apache.org/jira/browse/FLINK-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Metzger resolved FLINK-801.
----------------------------------
Resolution: Invalid
There is a new pull request for that:
https://github.com/apache/incubator-flink/pull/4
> Serialized String comparison, Unicode support
> ---------------------------------------------
>
> Key: FLINK-801
> URL: https://issues.apache.org/jira/browse/FLINK-801
> Project: Flink
> Issue Type: Bug
> Reporter: GitHub Import
> Labels: github-import
> Fix For: pre-apache
>
> Attachments: pull-request-801-3431874524946732791.patch
>
>
> The StringComparator now works on serialized data.
> To this end, new string read/write/copy/compare methods were introduced, which use a variable-length encoding for the characters.
> Key points:
> - The most significant bits are written/read first.
> - The first 2 bits of each encoded character encode its size.
> - An encoded character is at most 3 bytes long.
> Additionally, the StringSerializer now has full Unicode support. I couldn't find a Unicode character that uses more than 22 bits, so 3 bytes should be sufficient.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/pull/801
> Created by: [zentol|https://github.com/zentol]
> Labels:
> Created at: Tue May 13 18:06:22 CEST 2014
> State: open
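
For illustration, the variable-length encoding described in the issue above could look roughly like the following Java sketch. This is not the code from the pull request: the class name VarLenCharSketch, the method names, and the exact bit layout (the top 2 bits of the first byte hold the number of bytes that follow, leaving 6, 14, or 22 payload bits) are assumptions made for this example. It only shows why writing the most significant bits first lets two strings be compared byte by byte without deserializing them.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch; names and bit layout are assumptions, not the actual Flink code.
final class VarLenCharSketch {

    /** Writes one code point (at most 22 bits) as 1 to 3 bytes, most significant bits first. */
    static void writeChar(ByteArrayOutputStream out, int cp) {
        if (cp < 0 || cp >= (1 << 22)) {
            throw new IllegalArgumentException("code point does not fit in 22 bits: " + cp);
        }
        if (cp < (1 << 6)) {
            out.write(cp);                    // 00xxxxxx: 1 byte, 6 payload bits
        } else if (cp < (1 << 14)) {
            out.write(0x40 | (cp >>> 8));     // 01xxxxxx: 2 bytes, 14 payload bits
            out.write(cp & 0xFF);
        } else {
            out.write(0x80 | (cp >>> 16));    // 10xxxxxx: 3 bytes, 22 payload bits
            out.write((cp >>> 8) & 0xFF);
            out.write(cp & 0xFF);
        }
    }

    /** Encodes a whole string, one Unicode code point at a time. */
    static byte[] encode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        s.codePoints().forEach(cp -> writeChar(out, cp));
        return out.toByteArray();
    }

    /** Decodes bytes produced by encode back into a string. */
    static String decode(byte[] data) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < data.length) {
            int first = data[i++] & 0xFF;
            int extra = first >>> 6;          // number of bytes that follow (0..2)
            if (extra == 3) {
                throw new IllegalArgumentException("invalid size prefix");
            }
            int cp = first & 0x3F;
            for (int k = 0; k < extra; k++) {
                cp = (cp << 8) | (data[i++] & 0xFF);
            }
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    /** Compares two encoded strings as unsigned bytes, without decoding them. */
    static int compareSerialized(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }
}
```

Under these assumptions, a longer encoding always starts with a larger size prefix and represents a larger code point, so the unsigned byte order of two encoded strings matches their order by Unicode code point. That order can differ from String.compareTo, which compares UTF-16 code units.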