Hi devs,
I wanted to bring up a problem that we have in our package structure.

As a result of https://issues.apache.org/jira/browse/FLINK-13045 we started advertising importing two packages in the Scala API:

import org.apache.flink.table.api._
import org.apache.flink.table.api.scala._

The intention was that the first package (org.apache.flink.table.api) would contain all API classes that are required to work with the unified TableEnvironment, such as TableEnvironment, Table, Session, Slide and the expression DSL. The second package (org.apache.flink.table.api.scala._) would be an optional package that contains the bridging conversions between Table and the DataStream/DataSet APIs, including the more specific StreamTableEnvironment and BatchTableEnvironment.

The part missing in the original plan was to move all expression implicit conversions to the org.apache.flink.table.api package. Without that step, users of pure table programs (that do not use the table-api-scala-bridge module) cannot use the Expression DSL. Therefore we should try to move those expressions as soon as possible.

The problem with this approach is that it clashes with common imports of classes from the java.* and scala.* packages. Users are forced to write:

import org.apache.flink.table.api._
import org.apache.flink.table.api.scala._
import _root_.scala.collection.mutable.ArrayBuffer
import _root_.java.lang.Integer

Besides being cumbersome, it also messes up the macro-based type extraction (org.apache.flink.api.scala#createTypeInformation) for all classes from scala.* packages. I don't fully understand the reasons for it, but createTypeInformation somehow drops the _root_ for WeakTypeTags. So e.g. for a call createTypeInformation[_root_.scala.collection.mutable.ArrayBuffer], it actually tries to construct a TypeInformation for org.apache.flink.table.api.scala.collection.mutable.ArrayBuffer, which obviously fails.

What I would suggest for a target solution is to have:

1. For users of the unified Table API with the Scala Expression DSL:

import org.apache.flink.table.api._ (for TableEnvironment, Tumble etc. and expressions)

2. For users of the Table API with Scala's bridging conversions:

import org.apache.flink.table.api._ (for Tumble etc. and expressions)
import org.apache.flink.table.api.bridge.scala._ (for bridging conversions and StreamTableEnvironment)

3. For users of the unified Table API with the Java Expression DSL:

import org.apache.flink.table.api.* (for TableEnvironment, Tumble etc.)
import org.apache.flink.table.api.Expressions.* (for the Expression DSL)

4. For users of the Table API with Java's bridging conversions:

import org.apache.flink.table.api.* (for Tumble etc.)
import org.apache.flink.table.api.Expressions.* (for the Expression DSL)
import org.apache.flink.table.api.bridge.java.*

To have that working we need to:
* move the Scala expression DSL to the org.apache.flink.table.api package in the table-api-scala module
* move all classes from the org.apache.flink.table.api.scala and org.apache.flink.table.api.java packages to org.apache.flink.table.api.bridge.scala and org.apache.flink.table.api.bridge.java accordingly, and drop the former packages

The biggest question I have is how we want to perform that transition. If we do it in one go, we will break existing user programs that use classes from org.apache.flink.table.api.java/scala. Those packages have been present from day one of the Table API. Nevertheless, this would be my preferred way to move forward, as we annoy users only once, even if one big time :(

A different option would be to make that transition gradually, over 3 releases:

1. In the first release we introduce org.apache.flink.table.api.bridge.java/scala, and we have StreamTableEnvironment etc. as well as the expression DSL in both. We ask users to migrate to the new packages.
2. We drop org.apache.flink.table.api.java/scala and ask users to additionally import org.apache.flink.table.api.* for expressions (this is the same as what we did in 1.9.0, though it was extremely hard to do it then).
3. We finally move the expression DSL from org.apache.flink.table.api.bridge.scala to org.apache.flink.table.api.

What do you think about it?

Best,

Dawid
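The import clash described above is plain Scala name shadowing and can be reproduced without any Flink dependency. In this minimal snippet the org.example.* names are invented for illustration; the mechanism is the same as with org.apache.flink.table.api.scala:

```scala
// Declaring a subpackage named `scala` shadows the top-level `scala`
// package for any code that wildcard-imports its parent package.
package org.example.api.scala {
  class Dummy
}

package client {
  // After this import, the bare name `scala` resolves to
  // org.example.api.scala, not the top-level scala package.
  import org.example.api._
  import org.example.api.scala._

  object Demo {
    // Without `_root_` this line fails to compile, because it is
    // resolved as org.example.api.scala.collection.mutable.ArrayBuffer:
    //   import scala.collection.mutable.ArrayBuffer
    import _root_.scala.collection.mutable.ArrayBuffer // OK

    val buf: ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)
  }
}
```

The same shadowing applies to a subpackage named `java`, which is why both bridging packages are affected.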
Hi everyone,
thanks for bringing our offline discussion to the mailing list, Dawid. This is a very bad mistake that has been made in the past. In general, we should discourage putting the terms "java" and "scala" in package names, as this has side effects on Scala imports.

I really don't like forcing users to put a "_root_" in their imports. It also happened to me a couple of times while developing Flink code that I was sitting in front of my IDE wondering why the code doesn't compile.

I'm also in favor of performing this big change as early as possible. I'm sure Table API users are already quite annoyed by all the changes/refactorings happening. Changing the imports twice or three times is even more cumbersome.

Having to import just "org.apache.flink.table.api._" is a big usability plus for new users, and especially for interactive shell/notebook users.

Regards,
Timo

On 13.02.20 14:39, Dawid Wysakowicz wrote:
> [...]
Thanks for bringing up this discussion.

+1 to performing this big change as early as possible.

You answered my question of why we need "_root_". Yes, I don't like this import either. And it is very strange that the expression DSL is in api, but can only work via api.scala (because the api.scala package object extends ImplicitExpressionConversions).

About transitioning gradually over 3 releases: can users rewrite to the right way in one release, or must they update again and again?

Best,
Jingsong Lee

On Fri, Feb 14, 2020 at 12:12 AM Timo Walther <[hidden email]> wrote:
> [...]
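Under the target layout proposed in this thread, a pure table program in Scala would need only the one wildcard import. The following is a sketch, not working code: it assumes the expression DSL has already been moved into org.apache.flink.table.api as proposed, and it presumes a table named "Orders" has been registered.

```scala
// A single import now covers TableEnvironment, the window classes
// (Tumble etc.) and the Scala expression DSL.
import org.apache.flink.table.api._

// Resolves normally again: no subpackage named `scala` shadows the
// top-level scala package, so no `_root_` prefix is needed.
import scala.collection.mutable.ArrayBuffer

object PureTableJob {
  def main(args: Array[String]): Unit = {
    val settings = EnvironmentSettings.newInstance().inStreamingMode().build()
    // The unified TableEnvironment, without any bridge module on the
    // classpath:
    val tEnv = TableEnvironment.create(settings)

    // Scala expression DSL used from a pure table program, which is
    // exactly what the proposed move enables:
    val result = tEnv.from("Orders")
      .window(Tumble over 10.minutes on 'rowtime as 'w)
      .groupBy('w, 'user)
      .select('user, 'amount.sum as 'total)
  }
}
```

Only programs that convert between Table and DataStream/DataSet would add the second import, org.apache.flink.table.api.bridge.scala._, for StreamTableEnvironment and the conversions.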