Hi,
What I am talking about is `PlannerExpressionParserImpl`, which is written with Scala parser combinators. Every time we call StreamTableEnvironment#FromDataStream, the field String (or scala.Symbol in the Scala API) is parsed by `PlannerExpressionParserImpl` into an `Expression`. In the parser grammar in `PlannerExpressionParserImpl`, `fieldRefrence` is defined as `*` or `ident`. `ident` in `PlannerExpressionParserImpl` is simply the one from [[scala.util.parsing.combinator.JavaTokenParsers]], i.e. a JavaIdentifier.

After discussing this with Jark(云邪), I also found that `PlannerExpressionParserImpl` currently does not even support the quote character ('`').

I didn't know what you just told me about Calcite before, but that doesn't matter. Maybe we can let PlannerExpressionParserImpl#FieldRefrence use Unicode as its default charset and support '`' as a first step, and then make the whole project support the Unicode charset once the Calcite-related part is available.

By the way, I attended your lecture on Calcite at FFA Asia, which really inspired me a lot~

Best Regards
刘首维 Shoi Liu
Dalian University of Technology

------------------ Original Message ------------------
From: "Danny Chan" <[hidden email]>
Sent: Thursday, January 16, 2020, 12:45 PM
To: "刘首维" <[hidden email]>
Subject: Re: Let Flink SQL PlannerExpressionParserImpl#FieldRefrence use Unicode as its default charset

A user-defined charset for DB/session/table/column is not supported in Flink yet. Specifically, Flink uses Calcite as the planner engine, and Calcite does not support a configurable charset well either; there is a design doc [1], but it has never been implemented. Apache Calcite's default system charset is "ISO-8859-1".

Actually, I'm a little confused by your description: do you mean the charset of SqlIdentifier or of the string literal? They are different topics.
[1] https://docs.google.com/document/d/1wo5byn_6K_YOKiPdXNav1zgzt9IBC3SbPvpPnIShtXk/edit#heading=h.g4bnumde4dl5

Best,
Danny Chan

On January 15, 2020 at 11:08 PM +0800, 刘首维 <[hidden email]> wrote:

Hi all,

The related issue: https://issues.apache.org/jira/browse/FLINK-15573

As the title says, what I want to do is let `FieldRefrence` use Unicode as its default charset (or maybe as an optional charset that can be configured).

According to `PlannerExpressionParserImpl`, Flink currently uses JavaIdentifier as the default charset of `FieldRefrence`. From my perspective, that is not enough. Consider a user who uses ElasticSearch as a sink: ES has a field called `@timestamp`, which is not a valid JavaIdentifier.

So in my team, we let `PlannerExpressionParserImpl#FieldRefrence` use Unicode as its default charset, which solves this kind of problem. (Please refer to the issue mentioned above.)

In my opinion, the change is of general use. Firstly, MySQL supports Unicode as its default field charset (see the field named `@@`), so shall we support Unicode as well?

What's more, my team has really benefited a lot from this change, and I believe it can benefit other users too without doing any harm. Fortunately, the change is fully compatible with existing field names, because Unicode is a superset of JavaIdentifier. Only a small code change is needed to achieve this goal.

Looking forward to any opinions.

By the way, thanks to tison~

Best Regards
刘首维 Shoi Liu
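To make the `@timestamp` point concrete, here is a minimal Java sketch. It assumes that `ident` in `JavaTokenParsers` accepts what `Character.isJavaIdentifierStart` and `Character.isJavaIdentifierPart` accept; the class `IdentCheck` and its helper are hypothetical names for illustration and are not part of Flink.

```java
public class IdentCheck {

    // Assumed equivalent of the JavaIdentifier rule: one identifier-start
    // character followed by zero or more identifier-part characters.
    static boolean isJavaIdentifier(String s) {
        if (s.isEmpty() || !Character.isJavaIdentifierStart(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!Character.isJavaIdentifierPart(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Field names taken from the discussion, plus one ordinary name.
        String[] fields = {"user_id", "@timestamp", "@@"};
        for (String field : fields) {
            System.out.println(field + " -> " + isJavaIdentifier(field));
        }
    }
}
```

Under that assumption, `user_id` passes while `@timestamp` and `@@` are rejected, because `@` is neither a valid identifier start nor a valid identifier part.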