[DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

classic Classic list List threaded Threaded
23 messages Options
12
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jane Chan
Hi everyone,

I would like to start a discussion on FLINK-21045 [1] about supporting
`LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed by
FLIP-68 [2] as following.

-- load a module with the given name and append it to the end of the module
list
LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]

--unload a module by name from the module list and other modules remain in
the same relative positions
UNLOAD MODULE 'name'

After a round of discussion on the Jira ticket, it seems some unanswered
questions need more opinions and suggestions.

1. The way to redefine resolution order easily

    Rui Li suggested introducing `USE MODULES` and adding similar
functionality to the API because

>  1) It's very tedious to unload old modules just to reorder them.

 2) Users may not even know how to "re-load" an old module if it was not
> initially loaded by the user, e.g. don't know which type to use.


    Jane Chan wondered that module is not like the catalog which has a
concept of namespace could specify, and `USE` sounds like a
mutual-exclusive concept.
    Maybe `RELOAD MODULES` can express upgrading the priority of the loaded
module(s).


2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
    Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP MODULE` instead
of `LOAD/UNLOAD MODULE` because

>  1) From a pure SQL user's perspective, maybe `CREATE MODULE + USE MODULE`
> is easier to use rather than `LOAD/UNLOAD`.
>  2) This will be very similar to what the catalog used now.


  Timo Walther would rather stick to the agreed design because
loading/unloading modules is a concept known from kernels etc.

3. Simplify the module design by mapping modules purely by name

LOAD MODULE geo_utils
LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated 'type='/'module='
but allow only 1 module to be loaded parameterized
UNLOAD hive
USE MODULES hive, core


Please find more details in the reference link. Looking forward to your
feedback.

[1] https://issues.apache.org/jira/browse/FLINK-21045#
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules>
[2]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules

Best,
Jane
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jark Wu-2
Thanks Jane for the summary and starting the discussion in the mailing
list.

Here are my thoughts:

1) syntax to reorder modules
I agree with Rui Li it would be quite useful if we can have some syntax to
reorder modules.
I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y, z`,
because USE has a more sense of effective and specifying ordering, than
RELOAD.
From my feeling, RELOAD just means we unregister and register x,y,z modules
again,
it sounds like other registered modules are still in use and in the order.

3) mapping modules purely by name
This can definitely improve the usability of loading modules, because
the 'type=' property
looks really redundant. We can think of this as a syntax sugar that the
default type value is the module name.
And we can support to specify 'type=' property in the future to allow
multiple modules for one module type.

Besides, I would like to mention one more change, that the module name
proposed in FLIP-68 is a string literal.
But I think we are all on the same page to change it into a simple
(non-compound) identifier.

LOAD/UNLOAD MODULE 'core'
==>
LOAD/UNLOAD MODULE core


Best,
Jark


On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]> wrote:

> Hi everyone,
>
> I would like to start a discussion on FLINK-21045 [1] about supporting
> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed by
> FLIP-68 [2] as following.
>
> -- load a module with the given name and append it to the end of the module
> list
> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
>
> --unload a module by name from the module list and other modules remain in
> the same relative positions
> UNLOAD MODULE 'name'
>
> After a round of discussion on the Jira ticket, it seems some unanswered
> questions need more opinions and suggestions.
>
> 1. The way to redefine resolution order easily
>
>     Rui Li suggested introducing `USE MODULES` and adding similar
> functionality to the API because
>
> >  1) It's very tedious to unload old modules just to reorder them.
>
>  2) Users may not even know how to "re-load" an old module if it was not
> > initially loaded by the user, e.g. don't know which type to use.
>
>
>     Jane Chan wondered that module is not like the catalog which has a
> concept of namespace could specify, and `USE` sounds like a
> mutual-exclusive concept.
>     Maybe `RELOAD MODULES` can express upgrading the priority of the loaded
> module(s).
>
>
> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
>     Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP MODULE` instead
> of `LOAD/UNLOAD MODULE` because
>
> >  1) From a pure SQL user's perspective, maybe `CREATE MODULE + USE
> MODULE`
> > is easier to use rather than `LOAD/UNLOAD`.
> >  2) This will be very similar to what the catalog used now.
>
>
>   Timo Walther would rather stick to the agreed design because
> loading/unloading modules is a concept known from kernels etc.
>
> 3. Simplify the module design by mapping modules purely by name
>
> LOAD MODULE geo_utils
> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated 'type='/'module='
> but allow only 1 module to be loaded parameterized
> UNLOAD hive
> USE MODULES hive, core
>
>
> Please find more details in the reference link. Looking forward to your
> feedback.
>
> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> <
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >
> [2]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>
> Best,
> Jane
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Rui Li
Thanks Jane for starting the discussion.

Regarding #1, I also prefer `USE MODULES` syntax. It can be interpreted as
"setting the current order of modules", which is similar to "setting the
current catalog" for `USE CATALOG`.

Regarding #3, I'm fine to map modules purely by name because I think it
satisfies all the use cases we have at hand. But I guess we need to make
sure we're backward compatible, i.e. users don't need to change their yaml
files to configure the modules.

On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:

> Thanks Jane for the summary and starting the discussion in the mailing
> list.
>
> Here are my thoughts:
>
> 1) syntax to reorder modules
> I agree with Rui Li it would be quite useful if we can have some syntax to
> reorder modules.
> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y, z`,
> because USE has a more sense of effective and specifying ordering, than
> RELOAD.
> From my feeling, RELOAD just means we unregister and register x,y,z modules
> again,
> it sounds like other registered modules are still in use and in the order.
>
> 3) mapping modules purely by name
> This can definitely improve the usability of loading modules, because
> the 'type=' property
> looks really redundant. We can think of this as a syntax sugar that the
> default type value is the module name.
> And we can support to specify 'type=' property in the future to allow
> multiple modules for one module type.
>
> Besides, I would like to mention one more change, that the module name
> proposed in FLIP-68 is a string literal.
> But I think we are all on the same page to change it into a simple
> (non-compound) identifier.
>
> LOAD/UNLOAD MODULE 'core'
> ==>
> LOAD/UNLOAD MODULE core
>
>
> Best,
> Jark
>
>
> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]> wrote:
>
> > Hi everyone,
> >
> > I would like to start a discussion on FLINK-21045 [1] about supporting
> > `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed by
> > FLIP-68 [2] as following.
> >
> > -- load a module with the given name and append it to the end of the
> module
> > list
> > LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
> >
> > --unload a module by name from the module list and other modules remain
> in
> > the same relative positions
> > UNLOAD MODULE 'name'
> >
> > After a round of discussion on the Jira ticket, it seems some unanswered
> > questions need more opinions and suggestions.
> >
> > 1. The way to redefine resolution order easily
> >
> >     Rui Li suggested introducing `USE MODULES` and adding similar
> > functionality to the API because
> >
> > >  1) It's very tedious to unload old modules just to reorder them.
> >
> >  2) Users may not even know how to "re-load" an old module if it was not
> > > initially loaded by the user, e.g. don't know which type to use.
> >
> >
> >     Jane Chan wondered that module is not like the catalog which has a
> > concept of namespace could specify, and `USE` sounds like a
> > mutual-exclusive concept.
> >     Maybe `RELOAD MODULES` can express upgrading the priority of the
> loaded
> > module(s).
> >
> >
> > 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> >     Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP MODULE`
> instead
> > of `LOAD/UNLOAD MODULE` because
> >
> > >  1) From a pure SQL user's perspective, maybe `CREATE MODULE + USE
> > MODULE`
> > > is easier to use rather than `LOAD/UNLOAD`.
> > >  2) This will be very similar to what the catalog used now.
> >
> >
> >   Timo Walther would rather stick to the agreed design because
> > loading/unloading modules is a concept known from kernels etc.
> >
> > 3. Simplify the module design by mapping modules purely by name
> >
> > LOAD MODULE geo_utils
> > LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> 'type='/'module='
> > but allow only 1 module to be loaded parameterized
> > UNLOAD hive
> > USE MODULES hive, core
> >
> >
> > Please find more details in the reference link. Looking forward to your
> > feedback.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-21045#
> > <
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > >
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >
> > Best,
> > Jane
> >
>


--
Best regards!
Rui Li
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Timo Walther-2
Thanks for starting the discussion Jane.

I'm fine with using `USE` for reordering the modules.

I agree with Jark to not use a string literal for the module name but an
identifer.

However, to simplify the design I would completely remove the `type=`
property because having multiple ways of defining the same thing might
be confusing without providing additional benefits. I also think that
users should not be able to load the same module multiple times.

Regarding Rui's comment, the YAML file should not be affected by this
change and we can leave this part of the API untouched. We need to
update the `ModuleFactory` anyways because it still uses the deprecated
`TableFactory` class.

Regards,
Timo

On 01.02.21 09:18, Rui Li wrote:

> Thanks Jane for starting the discussion.
>
> Regarding #1, I also prefer `USE MODULES` syntax. It can be interpreted as
> "setting the current order of modules", which is similar to "setting the
> current catalog" for `USE CATALOG`.
>
> Regarding #3, I'm fine to map modules purely by name because I think it
> satisfies all the use cases we have at hand. But I guess we need to make
> sure we're backward compatible, i.e. users don't need to change their yaml
> files to configure the modules.
>
> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
>
>> Thanks Jane for the summary and starting the discussion in the mailing
>> list.
>>
>> Here are my thoughts:
>>
>> 1) syntax to reorder modules
>> I agree with Rui Li it would be quite useful if we can have some syntax to
>> reorder modules.
>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y, z`,
>> because USE has a more sense of effective and specifying ordering, than
>> RELOAD.
>>  From my feeling, RELOAD just means we unregister and register x,y,z modules
>> again,
>> it sounds like other registered modules are still in use and in the order.
>>
>> 3) mapping modules purely by name
>> This can definitely improve the usability of loading modules, because
>> the 'type=' property
>> looks really redundant. We can think of this as a syntax sugar that the
>> default type value is the module name.
>> And we can support to specify 'type=' property in the future to allow
>> multiple modules for one module type.
>>
>> Besides, I would like to mention one more change, that the module name
>> proposed in FLIP-68 is a string literal.
>> But I think we are all on the same page to change it into a simple
>> (non-compound) identifier.
>>
>> LOAD/UNLOAD MODULE 'core'
>> ==>
>> LOAD/UNLOAD MODULE core
>>
>>
>> Best,
>> Jark
>>
>>
>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]> wrote:
>>
>>> Hi everyone,
>>>
>>> I would like to start a discussion on FLINK-21045 [1] about supporting
>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed by
>>> FLIP-68 [2] as following.
>>>
>>> -- load a module with the given name and append it to the end of the
>> module
>>> list
>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
>>>
>>> --unload a module by name from the module list and other modules remain
>> in
>>> the same relative positions
>>> UNLOAD MODULE 'name'
>>>
>>> After a round of discussion on the Jira ticket, it seems some unanswered
>>> questions need more opinions and suggestions.
>>>
>>> 1. The way to redefine resolution order easily
>>>
>>>      Rui Li suggested introducing `USE MODULES` and adding similar
>>> functionality to the API because
>>>
>>>>   1) It's very tedious to unload old modules just to reorder them.
>>>
>>>   2) Users may not even know how to "re-load" an old module if it was not
>>>> initially loaded by the user, e.g. don't know which type to use.
>>>
>>>
>>>      Jane Chan wondered that module is not like the catalog which has a
>>> concept of namespace could specify, and `USE` sounds like a
>>> mutual-exclusive concept.
>>>      Maybe `RELOAD MODULES` can express upgrading the priority of the
>> loaded
>>> module(s).
>>>
>>>
>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
>>>      Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP MODULE`
>> instead
>>> of `LOAD/UNLOAD MODULE` because
>>>
>>>>   1) From a pure SQL user's perspective, maybe `CREATE MODULE + USE
>>> MODULE`
>>>> is easier to use rather than `LOAD/UNLOAD`.
>>>>   2) This will be very similar to what the catalog used now.
>>>
>>>
>>>    Timo Walther would rather stick to the agreed design because
>>> loading/unloading modules is a concept known from kernels etc.
>>>
>>> 3. Simplify the module design by mapping modules purely by name
>>>
>>> LOAD MODULE geo_utils
>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
>> 'type='/'module='
>>> but allow only 1 module to be loaded parameterized
>>> UNLOAD hive
>>> USE MODULES hive, core
>>>
>>>
>>> Please find more details in the reference link. Looking forward to your
>>> feedback.
>>>
>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
>>> <
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>
>>> [2]
>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>
>>> Best,
>>> Jane
>>>
>>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jane Chan
In reply to this post by Rui Li
Hi Jark and Rui,

Thanks for the discussions.

Regarding #1, I'm fine with `USE MODULES` syntax, and

> It can be interpreted as "setting the current order of modules", which is
> similar to "setting the current catalog" for `USE CATALOG`.
>
I would like to confirm that the unmentioned modules remain in the same
relative order? E.g., if there are three loaded modules `X`, `Y`, `Z`, then
`USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.

Regarding #3, I'm fine with mapping modules purely by name, and I think
Jark raised a good point on making the module name a simple identifier
instead of a string literal. For backward compatibility, since we haven't
supported this syntax yet, the affected users are those who defined modules
in the YAML configuration file. Maybe we can eliminate the 'type' from the
'requiredContext' to make it optional. Thus the proposed mapping mechanism
could use the module name to lookup the suitable factory,  and in the
meanwhile updating documentation to encourage users to simplify their YAML
configuration. And in the long run, we can deprecate the 'type'.

Best,
Jane

On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]> wrote:

> Thanks Jane for starting the discussion.
>
> Regarding #1, I also prefer `USE MODULES` syntax. It can be interpreted as
> "setting the current order of modules", which is similar to "setting the
> current catalog" for `USE CATALOG`.
>
> Regarding #3, I'm fine to map modules purely by name because I think it
> satisfies all the use cases we have at hand. But I guess we need to make
> sure we're backward compatible, i.e. users don't need to change their yaml
> files to configure the modules.
>
> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
>
> > Thanks Jane for the summary and starting the discussion in the mailing
> > list.
> >
> > Here are my thoughts:
> >
> > 1) syntax to reorder modules
> > I agree with Rui Li it would be quite useful if we can have some syntax
> to
> > reorder modules.
> > I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y, z`,
> > because USE has a more sense of effective and specifying ordering, than
> > RELOAD.
> > From my feeling, RELOAD just means we unregister and register x,y,z
> modules
> > again,
> > it sounds like other registered modules are still in use and in the
> order.
> >
> > 3) mapping modules purely by name
> > This can definitely improve the usability of loading modules, because
> > the 'type=' property
> > looks really redundant. We can think of this as a syntax sugar that the
> > default type value is the module name.
> > And we can support to specify 'type=' property in the future to allow
> > multiple modules for one module type.
> >
> > Besides, I would like to mention one more change, that the module name
> > proposed in FLIP-68 is a string literal.
> > But I think we are all on the same page to change it into a simple
> > (non-compound) identifier.
> >
> > LOAD/UNLOAD MODULE 'core'
> > ==>
> > LOAD/UNLOAD MODULE core
> >
> >
> > Best,
> > Jark
> >
> >
> > On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]> wrote:
> >
> > > Hi everyone,
> > >
> > > I would like to start a discussion on FLINK-21045 [1] about supporting
> > > `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed by
> > > FLIP-68 [2] as following.
> > >
> > > -- load a module with the given name and append it to the end of the
> > module
> > > list
> > > LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
> > >
> > > --unload a module by name from the module list and other modules remain
> > in
> > > the same relative positions
> > > UNLOAD MODULE 'name'
> > >
> > > After a round of discussion on the Jira ticket, it seems some
> unanswered
> > > questions need more opinions and suggestions.
> > >
> > > 1. The way to redefine resolution order easily
> > >
> > >     Rui Li suggested introducing `USE MODULES` and adding similar
> > > functionality to the API because
> > >
> > > >  1) It's very tedious to unload old modules just to reorder them.
> > >
> > >  2) Users may not even know how to "re-load" an old module if it was
> not
> > > > initially loaded by the user, e.g. don't know which type to use.
> > >
> > >
> > >     Jane Chan wondered that module is not like the catalog which has a
> > > concept of namespace could specify, and `USE` sounds like a
> > > mutual-exclusive concept.
> > >     Maybe `RELOAD MODULES` can express upgrading the priority of the
> > loaded
> > > module(s).
> > >
> > >
> > > 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> > >     Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP MODULE`
> > instead
> > > of `LOAD/UNLOAD MODULE` because
> > >
> > > >  1) From a pure SQL user's perspective, maybe `CREATE MODULE + USE
> > > MODULE`
> > > > is easier to use rather than `LOAD/UNLOAD`.
> > > >  2) This will be very similar to what the catalog used now.
> > >
> > >
> > >   Timo Walther would rather stick to the agreed design because
> > > loading/unloading modules is a concept known from kernels etc.
> > >
> > > 3. Simplify the module design by mapping modules purely by name
> > >
> > > LOAD MODULE geo_utils
> > > LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> > 'type='/'module='
> > > but allow only 1 module to be loaded parameterized
> > > UNLOAD hive
> > > USE MODULES hive, core
> > >
> > >
> > > Please find more details in the reference link. Looking forward to your
> > > feedback.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-21045#
> > > <
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > > >
> > > [2]
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > >
> > > Best,
> > > Jane
> > >
> >
>
>
> --
> Best regards!
> Rui Li
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Timo Walther-2
IMHO I would rather unload the not mentioned modules. The statement
expresses `USE` that implicilty implies that the other modules are "not
used". What do others think?

Regards,
Timo


On 01.02.21 11:28, Jane Chan wrote:

> Hi Jark and Rui,
>
> Thanks for the discussions.
>
> Regarding #1, I'm fine with `USE MODULES` syntax, and
>
>> It can be interpreted as "setting the current order of modules", which is
>> similar to "setting the current catalog" for `USE CATALOG`.
>>
> I would like to confirm that the unmentioned modules remain in the same
> relative order? E.g., if there are three loaded modules `X`, `Y`, `Z`, then
> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
>
> Regarding #3, I'm fine with mapping modules purely by name, and I think
> Jark raised a good point on making the module name a simple identifier
> instead of a string literal. For backward compatibility, since we haven't
> supported this syntax yet, the affected users are those who defined modules
> in the YAML configuration file. Maybe we can eliminate the 'type' from the
> 'requiredContext' to make it optional. Thus the proposed mapping mechanism
> could use the module name to lookup the suitable factory,  and in the
> meanwhile updating documentation to encourage users to simplify their YAML
> configuration. And in the long run, we can deprecate the 'type'.
>
> Best,
> Jane
>
> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]> wrote:
>
>> Thanks Jane for starting the discussion.
>>
>> Regarding #1, I also prefer `USE MODULES` syntax. It can be interpreted as
>> "setting the current order of modules", which is similar to "setting the
>> current catalog" for `USE CATALOG`.
>>
>> Regarding #3, I'm fine to map modules purely by name because I think it
>> satisfies all the use cases we have at hand. But I guess we need to make
>> sure we're backward compatible, i.e. users don't need to change their yaml
>> files to configure the modules.
>>
>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
>>
>>> Thanks Jane for the summary and starting the discussion in the mailing
>>> list.
>>>
>>> Here are my thoughts:
>>>
>>> 1) syntax to reorder modules
>>> I agree with Rui Li it would be quite useful if we can have some syntax
>> to
>>> reorder modules.
>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y, z`,
>>> because USE has a more sense of effective and specifying ordering, than
>>> RELOAD.
>>>  From my feeling, RELOAD just means we unregister and register x,y,z
>> modules
>>> again,
>>> it sounds like other registered modules are still in use and in the
>> order.
>>>
>>> 3) mapping modules purely by name
>>> This can definitely improve the usability of loading modules, because
>>> the 'type=' property
>>> looks really redundant. We can think of this as a syntax sugar that the
>>> default type value is the module name.
>>> And we can support to specify 'type=' property in the future to allow
>>> multiple modules for one module type.
>>>
>>> Besides, I would like to mention one more change, that the module name
>>> proposed in FLIP-68 is a string literal.
>>> But I think we are all on the same page to change it into a simple
>>> (non-compound) identifier.
>>>
>>> LOAD/UNLOAD MODULE 'core'
>>> ==>
>>> LOAD/UNLOAD MODULE core
>>>
>>>
>>> Best,
>>> Jark
>>>
>>>
>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I would like to start a discussion on FLINK-21045 [1] about supporting
>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed by
>>>> FLIP-68 [2] as following.
>>>>
>>>> -- load a module with the given name and append it to the end of the
>>> module
>>>> list
>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
>>>>
>>>> --unload a module by name from the module list and other modules remain
>>> in
>>>> the same relative positions
>>>> UNLOAD MODULE 'name'
>>>>
>>>> After a round of discussion on the Jira ticket, it seems some
>> unanswered
>>>> questions need more opinions and suggestions.
>>>>
>>>> 1. The way to redefine resolution order easily
>>>>
>>>>      Rui Li suggested introducing `USE MODULES` and adding similar
>>>> functionality to the API because
>>>>
>>>>>   1) It's very tedious to unload old modules just to reorder them.
>>>>
>>>>   2) Users may not even know how to "re-load" an old module if it was
>> not
>>>>> initially loaded by the user, e.g. don't know which type to use.
>>>>
>>>>
>>>>      Jane Chan wondered that module is not like the catalog which has a
>>>> concept of namespace could specify, and `USE` sounds like a
>>>> mutual-exclusive concept.
>>>>      Maybe `RELOAD MODULES` can express upgrading the priority of the
>>> loaded
>>>> module(s).
>>>>
>>>>
>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
>>>>      Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP MODULE`
>>> instead
>>>> of `LOAD/UNLOAD MODULE` because
>>>>
>>>>>   1) From a pure SQL user's perspective, maybe `CREATE MODULE + USE
>>>> MODULE`
>>>>> is easier to use rather than `LOAD/UNLOAD`.
>>>>>   2) This will be very similar to what the catalog used now.
>>>>
>>>>
>>>>    Timo Walther would rather stick to the agreed design because
>>>> loading/unloading modules is a concept known from kernels etc.
>>>>
>>>> 3. Simplify the module design by mapping modules purely by name
>>>>
>>>> LOAD MODULE geo_utils
>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
>>> 'type='/'module='
>>>> but allow only 1 module to be loaded parameterized
>>>> UNLOAD hive
>>>> USE MODULES hive, core
>>>>
>>>>
>>>> Please find more details in the reference link. Looking forward to your
>>>> feedback.
>>>>
>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
>>>> <
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>
>>>> [2]
>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>
>>>> Best,
>>>> Jane
>>>>
>>>
>>
>>
>> --
>> Best regards!
>> Rui Li
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jane Chan
Hi Timo, thanks for the discussion.

It seems to reach an agreement regarding #3 that <1> Module name should
better be a simple identifier rather than a string literal. <2> Property
`type` is redundant and should be removed, and mapping will rely on the
module name because loading a module multiple times just using a different
module name doesn't make much sense. <3> We should migrate to the newer API
rather than the deprecated `TableFactory` class.

Regarding #1, I think the point lies in whether changing the resolution
order implies an `unload` operation explicitly (i.e., users could sense
it). What do others think?

Best,
Jane

On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]> wrote:

> IMHO I would rather unload the not mentioned modules. The statement
> expresses `USE` that implicilty implies that the other modules are "not
> used". What do others think?
>
> Regards,
> Timo
>
>
> On 01.02.21 11:28, Jane Chan wrote:
> > Hi Jark and Rui,
> >
> > Thanks for the discussions.
> >
> > Regarding #1, I'm fine with `USE MODULES` syntax, and
> >
> >> It can be interpreted as "setting the current order of modules", which
> is
> >> similar to "setting the current catalog" for `USE CATALOG`.
> >>
> > I would like to confirm that the unmentioned modules remain in the same
> > relative order? E.g., if there are three loaded modules `X`, `Y`, `Z`,
> then
> > `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> >
> > Regarding #3, I'm fine with mapping modules purely by name, and I think
> > Jark raised a good point on making the module name a simple identifier
> > instead of a string literal. For backward compatibility, since we haven't
> > supported this syntax yet, the affected users are those who defined
> modules
> > in the YAML configuration file. Maybe we can eliminate the 'type' from
> the
> > 'requiredContext' to make it optional. Thus the proposed mapping
> mechanism
> > could use the module name to lookup the suitable factory,  and in the
> > meanwhile updating documentation to encourage users to simplify their
> YAML
> > configuration. And in the long run, we can deprecate the 'type'.
> >
> > Best,
> > Jane
> >
> > On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]> wrote:
> >
> >> Thanks Jane for starting the discussion.
> >>
> >> Regarding #1, I also prefer `USE MODULES` syntax. It can be interpreted
> as
> >> "setting the current order of modules", which is similar to "setting the
> >> current catalog" for `USE CATALOG`.
> >>
> >> Regarding #3, I'm fine to map modules purely by name because I think it
> >> satisfies all the use cases we have at hand. But I guess we need to make
> >> sure we're backward compatible, i.e. users don't need to change their
> yaml
> >> files to configure the modules.
> >>
> >> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
> >>
> >>> Thanks Jane for the summary and starting the discussion in the mailing
> >>> list.
> >>>
> >>> Here are my thoughts:
> >>>
> >>> 1) syntax to reorder modules
> >>> I agree with Rui Li it would be quite useful if we can have some syntax
> >> to
> >>> reorder modules.
> >>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y, z`,
> >>> because USE has a more sense of effective and specifying ordering, than
> >>> RELOAD.
> >>>  From my feeling, RELOAD just means we unregister and register x,y,z
> >> modules
> >>> again,
> >>> it sounds like other registered modules are still in use and in the
> >> order.
> >>>
> >>> 3) mapping modules purely by name
> >>> This can definitely improve the usability of loading modules, because
> >>> the 'type=' property
> >>> looks really redundant. We can think of this as a syntax sugar that the
> >>> default type value is the module name.
> >>> And we can support to specify 'type=' property in the future to allow
> >>> multiple modules for one module type.
> >>>
> >>> Besides, I would like to mention one more change, that the module name
> >>> proposed in FLIP-68 is a string literal.
> >>> But I think we are all on the same page to change it into a simple
> >>> (non-compound) identifier.
> >>>
> >>> LOAD/UNLOAD MODULE 'core'
> >>> ==>
> >>> LOAD/UNLOAD MODULE core
> >>>
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>>
> >>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]> wrote:
> >>>
> >>>> Hi everyone,
> >>>>
> >>>> I would like to start a discussion on FLINK-21045 [1] about supporting
> >>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed by
> >>>> FLIP-68 [2] as following.
> >>>>
> >>>> -- load a module with the given name and append it to the end of the
> >>> module
> >>>> list
> >>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
> >>>>
> >>>> --unload a module by name from the module list and other modules
> remain
> >>> in
> >>>> the same relative positions
> >>>> UNLOAD MODULE 'name'
> >>>>
> >>>> After a round of discussion on the Jira ticket, it seems some
> >> unanswered
> >>>> questions need more opinions and suggestions.
> >>>>
> >>>> 1. The way to redefine resolution order easily
> >>>>
> >>>>      Rui Li suggested introducing `USE MODULES` and adding similar
> >>>> functionality to the API because
> >>>>
> >>>>>   1) It's very tedious to unload old modules just to reorder them.
> >>>>
> >>>>   2) Users may not even know how to "re-load" an old module if it was
> >> not
> >>>>> initially loaded by the user, e.g. don't know which type to use.
> >>>>
> >>>>
> >>>>      Jane Chan wondered that module is not like the catalog which has
> a
> >>>> concept of namespace could specify, and `USE` sounds like a
> >>>> mutual-exclusive concept.
> >>>>      Maybe `RELOAD MODULES` can express upgrading the priority of the
> >>> loaded
> >>>> module(s).
> >>>>
> >>>>
> >>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> >>>>      Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP MODULE`
> >>> instead
> >>>> of `LOAD/UNLOAD MODULE` because
> >>>>
> >>>>>   1) From a pure SQL user's perspective, maybe `CREATE MODULE + USE
> >>>> MODULE`
> >>>>> is easier to use rather than `LOAD/UNLOAD`.
> >>>>>   2) This will be very similar to what the catalog used now.
> >>>>
> >>>>
> >>>>    Timo Walther would rather stick to the agreed design because
> >>>> loading/unloading modules is a concept known from kernels etc.
> >>>>
> >>>> 3. Simplify the module design by mapping modules purely by name
> >>>>
> >>>> LOAD MODULE geo_utils
> >>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> >>> 'type='/'module='
> >>>> but allow only 1 module to be loaded parameterized
> >>>> UNLOAD hive
> >>>> USE MODULES hive, core
> >>>>
> >>>>
> >>>> Please find more details in the reference link. Looking forward to
> your
> >>>> feedback.
> >>>>
> >>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> >>>> <
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>
> >>>> [2]
> >>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>
> >>>> Best,
> >>>> Jane
> >>>>
> >>>
> >>
> >>
> >> --
> >> Best regards!
> >> Rui Li
> >>
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jark Wu-2
I agree with Timo that the USE implies the specified modules are in use in
the specified order and others are not used.
This would be easier to know what's the result list and order after the USE
statement.
That means: if current modules in order are x, y, z. And `USE MODULES z, y`
means current modules in order are z, y.

But I would like to not unload the unmentioned modules in the USE
statement. Because it seems strange that USE
will implicitly remove modules. In the above example, the user may type the
wrong modules list using USE by mistake
 and would like to declare the list again, the user has to create the
module again with some properties he may don't know. Therefore, I propose
the USE statement just specifies the current module lists and doesn't
unload modules.
Besides that, we may need a new syntax to list all the modules including
not used but loaded.
We can introduce SHOW FULL MODULES for this purpose with an additional
`used` column.

For example:

Flink SQL> list modules:
-----------
| modules |
-----------
| x       |
| y       |
| z       |
-----------
Flink SQL> USE MODULES z, y;
Flink SQL> show modules:
-----------
| modules |
-----------
| z       |
| y       |
-----------
Flink SQL> show FULL modules;
-------------------
| modules |  used |
-------------------
| z       | true  |
| y       | true  |
| x       | false |
-------------------
Flink SQL> USE MODULES z, y, x;
Flink SQL> show modules;
-----------
| modules |
-----------
| z       |
| y       |
| x       |
-----------

What do you think?

Best,
Jark

On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]> wrote:

> Hi Timo, thanks for the discussion.
>
> It seems to reach an agreement regarding #3 that <1> Module name should
> better be a simple identifier rather than a string literal. <2> Property
> `type` is redundant and should be removed, and mapping will rely on the
> module name because loading a module multiple times just using a different
> module name doesn't make much sense. <3> We should migrate to the newer API
> rather than the deprecated `TableFactory` class.
>
> Regarding #1, I think the point lies in whether changing the resolution
> order implies an `unload` operation explicitly (i.e., users could sense
> it). What do others think?
>
> Best,
> Jane
>
> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]> wrote:
>
> > IMHO I would rather unload the not mentioned modules. The statement
> > expresses `USE` that implicilty implies that the other modules are "not
> > used". What do others think?
> >
> > Regards,
> > Timo
> >
> >
> > On 01.02.21 11:28, Jane Chan wrote:
> > > Hi Jark and Rui,
> > >
> > > Thanks for the discussions.
> > >
> > > Regarding #1, I'm fine with `USE MODULES` syntax, and
> > >
> > >> It can be interpreted as "setting the current order of modules", which
> > is
> > >> similar to "setting the current catalog" for `USE CATALOG`.
> > >>
> > > I would like to confirm that the unmentioned modules remain in the same
> > > relative order? E.g., if there are three loaded modules `X`, `Y`, `Z`,
> > then
> > > `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> > >
> > > Regarding #3, I'm fine with mapping modules purely by name, and I think
> > > Jark raised a good point on making the module name a simple identifier
> > > instead of a string literal. For backward compatibility, since we
> haven't
> > > supported this syntax yet, the affected users are those who defined
> > modules
> > > in the YAML configuration file. Maybe we can eliminate the 'type' from
> > the
> > > 'requiredContext' to make it optional. Thus the proposed mapping
> > mechanism
> > > could use the module name to lookup the suitable factory,  and in the
> > > meanwhile updating documentation to encourage users to simplify their
> > YAML
> > > configuration. And in the long run, we can deprecate the 'type'.
> > >
> > > Best,
> > > Jane
> > >
> > > On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]> wrote:
> > >
> > >> Thanks Jane for starting the discussion.
> > >>
> > >> Regarding #1, I also prefer `USE MODULES` syntax. It can be
> interpreted
> > as
> > >> "setting the current order of modules", which is similar to "setting
> the
> > >> current catalog" for `USE CATALOG`.
> > >>
> > >> Regarding #3, I'm fine to map modules purely by name because I think
> it
> > >> satisfies all the use cases we have at hand. But I guess we need to
> make
> > >> sure we're backward compatible, i.e. users don't need to change their
> > yaml
> > >> files to configure the modules.
> > >>
> > >> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
> > >>
> > >>> Thanks Jane for the summary and starting the discussion in the
> mailing
> > >>> list.
> > >>>
> > >>> Here are my thoughts:
> > >>>
> > >>> 1) syntax to reorder modules
> > >>> I agree with Rui Li it would be quite useful if we can have some
> syntax
> > >> to
> > >>> reorder modules.
> > >>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y,
> z`,
> > >>> because USE has a more sense of effective and specifying ordering,
> than
> > >>> RELOAD.
> > >>>  From my feeling, RELOAD just means we unregister and register x,y,z
> > >> modules
> > >>> again,
> > >>> it sounds like other registered modules are still in use and in the
> > >> order.
> > >>>
> > >>> 3) mapping modules purely by name
> > >>> This can definitely improve the usability of loading modules, because
> > >>> the 'type=' property
> > >>> looks really redundant. We can think of this as a syntax sugar that
> the
> > >>> default type value is the module name.
> > >>> And we can support to specify 'type=' property in the future to allow
> > >>> multiple modules for one module type.
> > >>>
> > >>> Besides, I would like to mention one more change, that the module
> name
> > >>> proposed in FLIP-68 is a string literal.
> > >>> But I think we are all on the same page to change it into a simple
> > >>> (non-compound) identifier.
> > >>>
> > >>> LOAD/UNLOAD MODULE 'core'
> > >>> ==>
> > >>> LOAD/UNLOAD MODULE core
> > >>>
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>>
> > >>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]>
> wrote:
> > >>>
> > >>>> Hi everyone,
> > >>>>
> > >>>> I would like to start a discussion on FLINK-21045 [1] about
> supporting
> > >>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed by
> > >>>> FLIP-68 [2] as following.
> > >>>>
> > >>>> -- load a module with the given name and append it to the end of the
> > >>> module
> > >>>> list
> > >>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
> > >>>>
> > >>>> --unload a module by name from the module list and other modules
> > remain
> > >>> in
> > >>>> the same relative positions
> > >>>> UNLOAD MODULE 'name'
> > >>>>
> > >>>> After a round of discussion on the Jira ticket, it seems some
> > >> unanswered
> > >>>> questions need more opinions and suggestions.
> > >>>>
> > >>>> 1. The way to redefine resolution order easily
> > >>>>
> > >>>>      Rui Li suggested introducing `USE MODULES` and adding similar
> > >>>> functionality to the API because
> > >>>>
> > >>>>>   1) It's very tedious to unload old modules just to reorder them.
> > >>>>
> > >>>>   2) Users may not even know how to "re-load" an old module if it
> was
> > >> not
> > >>>>> initially loaded by the user, e.g. don't know which type to use.
> > >>>>
> > >>>>
> > >>>>      Jane Chan wondered that module is not like the catalog which
> has
> > a
> > >>>> concept of namespace could specify, and `USE` sounds like a
> > >>>> mutual-exclusive concept.
> > >>>>      Maybe `RELOAD MODULES` can express upgrading the priority of
> the
> > >>> loaded
> > >>>> module(s).
> > >>>>
> > >>>>
> > >>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> > >>>>      Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP MODULE`
> > >>> instead
> > >>>> of `LOAD/UNLOAD MODULE` because
> > >>>>
> > >>>>>   1) From a pure SQL user's perspective, maybe `CREATE MODULE + USE
> > >>>> MODULE`
> > >>>>> is easier to use rather than `LOAD/UNLOAD`.
> > >>>>>   2) This will be very similar to what the catalog used now.
> > >>>>
> > >>>>
> > >>>>    Timo Walther would rather stick to the agreed design because
> > >>>> loading/unloading modules is a concept known from kernels etc.
> > >>>>
> > >>>> 3. Simplify the module design by mapping modules purely by name
> > >>>>
> > >>>> LOAD MODULE geo_utils
> > >>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> > >>> 'type='/'module='
> > >>>> but allow only 1 module to be loaded parameterized
> > >>>> UNLOAD hive
> > >>>> USE MODULES hive, core
> > >>>>
> > >>>>
> > >>>> Please find more details in the reference link. Looking forward to
> > your
> > >>>> feedback.
> > >>>>
> > >>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> > >>>> <
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > >>>>>
> > >>>> [2]
> > >>>>
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > >>>>
> > >>>> Best,
> > >>>> Jane
> > >>>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Best regards!
> > >> Rui Li
> > >>
> > >
> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Rui Li
If `USE MODULES` implies unloading modules that are not listed, does it
also imply loading modules that are not previously loaded, especially since
we're mapping modules by name now?

On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:

> I agree with Timo that the USE implies the specified modules are in use in
> the specified order and others are not used.
> This would be easier to know what's the result list and order after the USE
> statement.
> That means: if current modules in order are x, y, z. And `USE MODULES z, y`
> means current modules in order are z, y.
>
> But I would like to not unload the unmentioned modules in the USE
> statement. Because it seems strange that USE
> will implicitly remove modules. In the above example, the user may type the
> wrong modules list using USE by mistake
>  and would like to declare the list again, the user has to create the
> module again with some properties he may don't know. Therefore, I propose
> the USE statement just specifies the current module lists and doesn't
> unload modules.
> Besides that, we may need a new syntax to list all the modules including
> not used but loaded.
> We can introduce SHOW FULL MODULES for this purpose with an additional
> `used` column.
>
> For example:
>
> Flink SQL> list modules:
> -----------
> | modules |
> -----------
> | x       |
> | y       |
> | z       |
> -----------
> Flink SQL> USE MODULES z, y;
> Flink SQL> show modules:
> -----------
> | modules |
> -----------
> | z       |
> | y       |
> -----------
> Flink SQL> show FULL modules;
> -------------------
> | modules |  used |
> -------------------
> | z       | true  |
> | y       | true  |
> | x       | false |
> -------------------
> Flink SQL> USE MODULES z, y, x;
> Flink SQL> show modules;
> -----------
> | modules |
> -----------
> | z       |
> | y       |
> | x       |
> -----------
>
> What do you think?
>
> Best,
> Jark
>
> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]> wrote:
>
> > Hi Timo, thanks for the discussion.
> >
> > It seems to reach an agreement regarding #3 that <1> Module name should
> > better be a simple identifier rather than a string literal. <2> Property
> > `type` is redundant and should be removed, and mapping will rely on the
> > module name because loading a module multiple times just using a
> different
> > module name doesn't make much sense. <3> We should migrate to the newer
> API
> > rather than the deprecated `TableFactory` class.
> >
> > Regarding #1, I think the point lies in whether changing the resolution
> > order implies an `unload` operation explicitly (i.e., users could sense
> > it). What do others think?
> >
> > Best,
> > Jane
> >
> > On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]> wrote:
> >
> > > IMHO I would rather unload the not mentioned modules. The statement
> > > expresses `USE` that implicilty implies that the other modules are "not
> > > used". What do others think?
> > >
> > > Regards,
> > > Timo
> > >
> > >
> > > On 01.02.21 11:28, Jane Chan wrote:
> > > > Hi Jark and Rui,
> > > >
> > > > Thanks for the discussions.
> > > >
> > > > Regarding #1, I'm fine with `USE MODULES` syntax, and
> > > >
> > > >> It can be interpreted as "setting the current order of modules",
> which
> > > is
> > > >> similar to "setting the current catalog" for `USE CATALOG`.
> > > >>
> > > > I would like to confirm that the unmentioned modules remain in the
> same
> > > > relative order? E.g., if there are three loaded modules `X`, `Y`,
> `Z`,
> > > then
> > > > `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> > > >
> > > > Regarding #3, I'm fine with mapping modules purely by name, and I
> think
> > > > Jark raised a good point on making the module name a simple
> identifier
> > > > instead of a string literal. For backward compatibility, since we
> > haven't
> > > > supported this syntax yet, the affected users are those who defined
> > > modules
> > > > in the YAML configuration file. Maybe we can eliminate the 'type'
> from
> > > the
> > > > 'requiredContext' to make it optional. Thus the proposed mapping
> > > mechanism
> > > > could use the module name to lookup the suitable factory,  and in the
> > > > meanwhile updating documentation to encourage users to simplify their
> > > YAML
> > > > configuration. And in the long run, we can deprecate the 'type'.
> > > >
> > > > Best,
> > > > Jane
> > > >
> > > > On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]> wrote:
> > > >
> > > >> Thanks Jane for starting the discussion.
> > > >>
> > > >> Regarding #1, I also prefer `USE MODULES` syntax. It can be
> > interpreted
> > > as
> > > >> "setting the current order of modules", which is similar to "setting
> > the
> > > >> current catalog" for `USE CATALOG`.
> > > >>
> > > >> Regarding #3, I'm fine to map modules purely by name because I think
> > it
> > > >> satisfies all the use cases we have at hand. But I guess we need to
> > make
> > > >> sure we're backward compatible, i.e. users don't need to change
> their
> > > yaml
> > > >> files to configure the modules.
> > > >>
> > > >> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
> > > >>
> > > >>> Thanks Jane for the summary and starting the discussion in the
> > mailing
> > > >>> list.
> > > >>>
> > > >>> Here are my thoughts:
> > > >>>
> > > >>> 1) syntax to reorder modules
> > > >>> I agree with Rui Li it would be quite useful if we can have some
> > syntax
> > > >> to
> > > >>> reorder modules.
> > > >>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y,
> > z`,
> > > >>> because USE has a more sense of effective and specifying ordering,
> > than
> > > >>> RELOAD.
> > > >>>  From my feeling, RELOAD just means we unregister and register
> x,y,z
> > > >> modules
> > > >>> again,
> > > >>> it sounds like other registered modules are still in use and in the
> > > >> order.
> > > >>>
> > > >>> 3) mapping modules purely by name
> > > >>> This can definitely improve the usability of loading modules,
> because
> > > >>> the 'type=' property
> > > >>> looks really redundant. We can think of this as a syntax sugar that
> > the
> > > >>> default type value is the module name.
> > > >>> And we can support to specify 'type=' property in the future to
> allow
> > > >>> multiple modules for one module type.
> > > >>>
> > > >>> Besides, I would like to mention one more change, that the module
> > name
> > > >>> proposed in FLIP-68 is a string literal.
> > > >>> But I think we are all on the same page to change it into a simple
> > > >>> (non-compound) identifier.
> > > >>>
> > > >>> LOAD/UNLOAD MODULE 'core'
> > > >>> ==>
> > > >>> LOAD/UNLOAD MODULE core
> > > >>>
> > > >>>
> > > >>> Best,
> > > >>> Jark
> > > >>>
> > > >>>
> > > >>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]>
> > wrote:
> > > >>>
> > > >>>> Hi everyone,
> > > >>>>
> > > >>>> I would like to start a discussion on FLINK-21045 [1] about
> > supporting
> > > >>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed
> by
> > > >>>> FLIP-68 [2] as following.
> > > >>>>
> > > >>>> -- load a module with the given name and append it to the end of
> the
> > > >>> module
> > > >>>> list
> > > >>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
> > > >>>>
> > > >>>> --unload a module by name from the module list and other modules
> > > remain
> > > >>> in
> > > >>>> the same relative positions
> > > >>>> UNLOAD MODULE 'name'
> > > >>>>
> > > >>>> After a round of discussion on the Jira ticket, it seems some
> > > >> unanswered
> > > >>>> questions need more opinions and suggestions.
> > > >>>>
> > > >>>> 1. The way to redefine resolution order easily
> > > >>>>
> > > >>>>      Rui Li suggested introducing `USE MODULES` and adding similar
> > > >>>> functionality to the API because
> > > >>>>
> > > >>>>>   1) It's very tedious to unload old modules just to reorder
> them.
> > > >>>>
> > > >>>>   2) Users may not even know how to "re-load" an old module if it
> > was
> > > >> not
> > > >>>>> initially loaded by the user, e.g. don't know which type to use.
> > > >>>>
> > > >>>>
> > > >>>>      Jane Chan wondered that module is not like the catalog which
> > has
> > > a
> > > >>>> concept of namespace could specify, and `USE` sounds like a
> > > >>>> mutual-exclusive concept.
> > > >>>>      Maybe `RELOAD MODULES` can express upgrading the priority of
> > the
> > > >>> loaded
> > > >>>> module(s).
> > > >>>>
> > > >>>>
> > > >>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> > > >>>>      Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP
> MODULE`
> > > >>> instead
> > > >>>> of `LOAD/UNLOAD MODULE` because
> > > >>>>
> > > >>>>>   1) From a pure SQL user's perspective, maybe `CREATE MODULE +
> USE
> > > >>>> MODULE`
> > > >>>>> is easier to use rather than `LOAD/UNLOAD`.
> > > >>>>>   2) This will be very similar to what the catalog used now.
> > > >>>>
> > > >>>>
> > > >>>>    Timo Walther would rather stick to the agreed design because
> > > >>>> loading/unloading modules is a concept known from kernels etc.
> > > >>>>
> > > >>>> 3. Simplify the module design by mapping modules purely by name
> > > >>>>
> > > >>>> LOAD MODULE geo_utils
> > > >>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> > > >>> 'type='/'module='
> > > >>>> but allow only 1 module to be loaded parameterized
> > > >>>> UNLOAD hive
> > > >>>> USE MODULES hive, core
> > > >>>>
> > > >>>>
> > > >>>> Please find more details in the reference link. Looking forward to
> > > your
> > > >>>> feedback.
> > > >>>>
> > > >>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> > > >>>> <
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > > >>>>>
> > > >>>> [2]
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > > >>>>
> > > >>>> Best,
> > > >>>> Jane
> > > >>>>
> > > >>>
> > > >>
> > > >>
> > > >> --
> > > >> Best regards!
> > > >> Rui Li
> > > >>
> > > >
> > >
> > >
> >
>


--
Best regards!
Rui Li
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Timo Walther-2
+1 to Jark's proposal

I like the difference between just loading and actually enabling these
modules.

@Rui: I would use the same behavior as catalogs here. You cannot `USE` a
catalog without creating it before.

Another question is whether a LOAD operation also adds the module to the
enabled list by default?

Regards,
Timo

On 01.02.21 13:52, Rui Li wrote:

> If `USE MODULES` implies unloading modules that are not listed, does it
> also imply loading modules that are not previously loaded, especially since
> we're mapping modules by name now?
>
> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
>
>> I agree with Timo that the USE implies the specified modules are in use in
>> the specified order and others are not used.
>> This would be easier to know what's the result list and order after the USE
>> statement.
>> That means: if current modules in order are x, y, z. And `USE MODULES z, y`
>> means current modules in order are z, y.
>>
>> But I would like to not unload the unmentioned modules in the USE
>> statement. Because it seems strange that USE
>> will implicitly remove modules. In the above example, the user may type the
>> wrong modules list using USE by mistake
>>   and would like to declare the list again, the user has to create the
>> module again with some properties he may don't know. Therefore, I propose
>> the USE statement just specifies the current module lists and doesn't
>> unload modules.
>> Besides that, we may need a new syntax to list all the modules including
>> not used but loaded.
>> We can introduce SHOW FULL MODULES for this purpose with an additional
>> `used` column.
>>
>> For example:
>>
>> Flink SQL> list modules:
>> -----------
>> | modules |
>> -----------
>> | x       |
>> | y       |
>> | z       |
>> -----------
>> Flink SQL> USE MODULES z, y;
>> Flink SQL> show modules:
>> -----------
>> | modules |
>> -----------
>> | z       |
>> | y       |
>> -----------
>> Flink SQL> show FULL modules;
>> -------------------
>> | modules |  used |
>> -------------------
>> | z       | true  |
>> | y       | true  |
>> | x       | false |
>> -------------------
>> Flink SQL> USE MODULES z, y, x;
>> Flink SQL> show modules;
>> -----------
>> | modules |
>> -----------
>> | z       |
>> | y       |
>> | x       |
>> -----------
>>
>> What do you think?
>>
>> Best,
>> Jark
>>
>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]> wrote:
>>
>>> Hi Timo, thanks for the discussion.
>>>
>>> It seems to reach an agreement regarding #3 that <1> Module name should
>>> better be a simple identifier rather than a string literal. <2> Property
>>> `type` is redundant and should be removed, and mapping will rely on the
>>> module name because loading a module multiple times just using a
>> different
>>> module name doesn't make much sense. <3> We should migrate to the newer
>> API
>>> rather than the deprecated `TableFactory` class.
>>>
>>> Regarding #1, I think the point lies in whether changing the resolution
>>> order implies an `unload` operation explicitly (i.e., users could sense
>>> it). What do others think?
>>>
>>> Best,
>>> Jane
>>>
>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]> wrote:
>>>
>>>> IMHO I would rather unload the not mentioned modules. The statement
>>>> expresses `USE` that implicilty implies that the other modules are "not
>>>> used". What do others think?
>>>>
>>>> Regards,
>>>> Timo
>>>>
>>>>
>>>> On 01.02.21 11:28, Jane Chan wrote:
>>>>> Hi Jark and Rui,
>>>>>
>>>>> Thanks for the discussions.
>>>>>
>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
>>>>>
>>>>>> It can be interpreted as "setting the current order of modules",
>> which
>>>> is
>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
>>>>>>
>>>>> I would like to confirm that the unmentioned modules remain in the
>> same
>>>>> relative order? E.g., if there are three loaded modules `X`, `Y`,
>> `Z`,
>>>> then
>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
>>>>>
>>>>> Regarding #3, I'm fine with mapping modules purely by name, and I
>> think
>>>>> Jark raised a good point on making the module name a simple
>> identifier
>>>>> instead of a string literal. For backward compatibility, since we
>>> haven't
>>>>> supported this syntax yet, the affected users are those who defined
>>>> modules
>>>>> in the YAML configuration file. Maybe we can eliminate the 'type'
>> from
>>>> the
>>>>> 'requiredContext' to make it optional. Thus the proposed mapping
>>>> mechanism
>>>>> could use the module name to lookup the suitable factory,  and in the
>>>>> meanwhile updating documentation to encourage users to simplify their
>>>> YAML
>>>>> configuration. And in the long run, we can deprecate the 'type'.
>>>>>
>>>>> Best,
>>>>> Jane
>>>>>
>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]> wrote:
>>>>>
>>>>>> Thanks Jane for starting the discussion.
>>>>>>
>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
>>> interpreted
>>>> as
>>>>>> "setting the current order of modules", which is similar to "setting
>>> the
>>>>>> current catalog" for `USE CATALOG`.
>>>>>>
>>>>>> Regarding #3, I'm fine to map modules purely by name because I think
>>> it
>>>>>> satisfies all the use cases we have at hand. But I guess we need to
>>> make
>>>>>> sure we're backward compatible, i.e. users don't need to change
>> their
>>>> yaml
>>>>>> files to configure the modules.
>>>>>>
>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
>>>>>>
>>>>>>> Thanks Jane for the summary and starting the discussion in the
>>> mailing
>>>>>>> list.
>>>>>>>
>>>>>>> Here are my thoughts:
>>>>>>>
>>>>>>> 1) syntax to reorder modules
>>>>>>> I agree with Rui Li it would be quite useful if we can have some
>>> syntax
>>>>>> to
>>>>>>> reorder modules.
>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y,
>>> z`,
>>>>>>> because USE has a more sense of effective and specifying ordering,
>>> than
>>>>>>> RELOAD.
>>>>>>>   From my feeling, RELOAD just means we unregister and register
>> x,y,z
>>>>>> modules
>>>>>>> again,
>>>>>>> it sounds like other registered modules are still in use and in the
>>>>>> order.
>>>>>>>
>>>>>>> 3) mapping modules purely by name
>>>>>>> This can definitely improve the usability of loading modules,
>> because
>>>>>>> the 'type=' property
>>>>>>> looks really redundant. We can think of this as a syntax sugar that
>>> the
>>>>>>> default type value is the module name.
>>>>>>> And we can support to specify 'type=' property in the future to
>> allow
>>>>>>> multiple modules for one module type.
>>>>>>>
>>>>>>> Besides, I would like to mention one more change, that the module
>>> name
>>>>>>> proposed in FLIP-68 is a string literal.
>>>>>>> But I think we are all on the same page to change it into a simple
>>>>>>> (non-compound) identifier.
>>>>>>>
>>>>>>> LOAD/UNLOAD MODULE 'core'
>>>>>>> ==>
>>>>>>> LOAD/UNLOAD MODULE core
>>>>>>>
>>>>>>>
>>>>>>> Best,
>>>>>>> Jark
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]>
>>> wrote:
>>>>>>>
>>>>>>>> Hi everyone,
>>>>>>>>
>>>>>>>> I would like to start a discussion on FLINK-21045 [1] about
>>> supporting
>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed
>> by
>>>>>>>> FLIP-68 [2] as following.
>>>>>>>>
>>>>>>>> -- load a module with the given name and append it to the end of
>> the
>>>>>>> module
>>>>>>>> list
>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
>>>>>>>>
>>>>>>>> --unload a module by name from the module list and other modules
>>>> remain
>>>>>>> in
>>>>>>>> the same relative positions
>>>>>>>> UNLOAD MODULE 'name'
>>>>>>>>
>>>>>>>> After a round of discussion on the Jira ticket, it seems some
>>>>>> unanswered
>>>>>>>> questions need more opinions and suggestions.
>>>>>>>>
>>>>>>>> 1. The way to redefine resolution order easily
>>>>>>>>
>>>>>>>>       Rui Li suggested introducing `USE MODULES` and adding similar
>>>>>>>> functionality to the API because
>>>>>>>>
>>>>>>>>>    1) It's very tedious to unload old modules just to reorder
>> them.
>>>>>>>>
>>>>>>>>    2) Users may not even know how to "re-load" an old module if it
>>> was
>>>>>> not
>>>>>>>>> initially loaded by the user, e.g. don't know which type to use.
>>>>>>>>
>>>>>>>>
>>>>>>>>       Jane Chan wondered that module is not like the catalog which
>>> has
>>>> a
>>>>>>>> concept of namespace could specify, and `USE` sounds like a
>>>>>>>> mutual-exclusive concept.
>>>>>>>>       Maybe `RELOAD MODULES` can express upgrading the priority of
>>> the
>>>>>>> loaded
>>>>>>>> module(s).
>>>>>>>>
>>>>>>>>
>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
>>>>>>>>       Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP
>> MODULE`
>>>>>>> instead
>>>>>>>> of `LOAD/UNLOAD MODULE` because
>>>>>>>>
>>>>>>>>>    1) From a pure SQL user's perspective, maybe `CREATE MODULE +
>> USE
>>>>>>>> MODULE`
>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
>>>>>>>>>    2) This will be very similar to what the catalog used now.
>>>>>>>>
>>>>>>>>
>>>>>>>>     Timo Walther would rather stick to the agreed design because
>>>>>>>> loading/unloading modules is a concept known from kernels etc.
>>>>>>>>
>>>>>>>> 3. Simplify the module design by mapping modules purely by name
>>>>>>>>
>>>>>>>> LOAD MODULE geo_utils
>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
>>>>>>> 'type='/'module='
>>>>>>>> but allow only 1 module to be loaded parameterized
>>>>>>>> UNLOAD hive
>>>>>>>> USE MODULES hive, core
>>>>>>>>
>>>>>>>>
>>>>>>>> Please find more details in the reference link. Looking forward to
>>>> your
>>>>>>>> feedback.
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
>>>>>>>> <
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>>>>>
>>>>>>>> [2]
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Jane
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards!
>>>>>> Rui Li
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jane Chan
+1 to Jark's proposal

 To make it clearer,  will `module#getFunctionDefinition()` return empty
suppose the module is loaded but not enabled?

Best,
Jane

On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]> wrote:

> +1 to Jark's proposal
>
> I like the difference between just loading and actually enabling these
> modules.
>
> @Rui: I would use the same behavior as catalogs here. You cannot `USE` a
> catalog without creating it before.
>
> Another question is whether a LOAD operation also adds the module to the
> enabled list by default?
>
> Regards,
> Timo
>
> On 01.02.21 13:52, Rui Li wrote:
> > If `USE MODULES` implies unloading modules that are not listed, does it
> > also imply loading modules that are not previously loaded, especially
> since
> > we're mapping modules by name now?
> >
> > On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
> >
> >> I agree with Timo that the USE implies the specified modules are in use
> in
> >> the specified order and others are not used.
> >> This would be easier to know what's the result list and order after the
> USE
> >> statement.
> >> That means: if current modules in order are x, y, z. And `USE MODULES
> z, y`
> >> means current modules in order are z, y.
> >>
> >> But I would like to not unload the unmentioned modules in the USE
> >> statement. Because it seems strange that USE
> >> will implicitly remove modules. In the above example, the user may type
> the
> >> wrong modules list using USE by mistake
> >>   and would like to declare the list again, the user has to create the
> >> module again with some properties he may don't know. Therefore, I
> propose
> >> the USE statement just specifies the current module lists and doesn't
> >> unload modules.
> >> Besides that, we may need a new syntax to list all the modules including
> >> not used but loaded.
> >> We can introduce SHOW FULL MODULES for this purpose with an additional
> >> `used` column.
> >>
> >> For example:
> >>
> >> Flink SQL> list modules:
> >> -----------
> >> | modules |
> >> -----------
> >> | x       |
> >> | y       |
> >> | z       |
> >> -----------
> >> Flink SQL> USE MODULES z, y;
> >> Flink SQL> show modules:
> >> -----------
> >> | modules |
> >> -----------
> >> | z       |
> >> | y       |
> >> -----------
> >> Flink SQL> show FULL modules;
> >> -------------------
> >> | modules |  used |
> >> -------------------
> >> | z       | true  |
> >> | y       | true  |
> >> | x       | false |
> >> -------------------
> >> Flink SQL> USE MODULES z, y, x;
> >> Flink SQL> show modules;
> >> -----------
> >> | modules |
> >> -----------
> >> | z       |
> >> | y       |
> >> | x       |
> >> -----------
> >>
> >> What do you think?
> >>
> >> Best,
> >> Jark
> >>
> >> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]> wrote:
> >>
> >>> Hi Timo, thanks for the discussion.
> >>>
> >>> It seems to reach an agreement regarding #3 that <1> Module name should
> >>> better be a simple identifier rather than a string literal. <2>
> Property
> >>> `type` is redundant and should be removed, and mapping will rely on the
> >>> module name because loading a module multiple times just using a
> >> different
> >>> module name doesn't make much sense. <3> We should migrate to the newer
> >> API
> >>> rather than the deprecated `TableFactory` class.
> >>>
> >>> Regarding #1, I think the point lies in whether changing the resolution
> >>> order implies an `unload` operation explicitly (i.e., users could sense
> >>> it). What do others think?
> >>>
> >>> Best,
> >>> Jane
> >>>
> >>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]>
> wrote:
> >>>
> >>>> IMHO I would rather unload the not mentioned modules. The statement
> >>>> expresses `USE` that implicilty implies that the other modules are
> "not
> >>>> used". What do others think?
> >>>>
> >>>> Regards,
> >>>> Timo
> >>>>
> >>>>
> >>>> On 01.02.21 11:28, Jane Chan wrote:
> >>>>> Hi Jark and Rui,
> >>>>>
> >>>>> Thanks for the discussions.
> >>>>>
> >>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
> >>>>>
> >>>>>> It can be interpreted as "setting the current order of modules",
> >> which
> >>>> is
> >>>>>> similar to "setting the current catalog" for `USE CATALOG`.
> >>>>>>
> >>>>> I would like to confirm that the unmentioned modules remain in the
> >> same
> >>>>> relative order? E.g., if there are three loaded modules `X`, `Y`,
> >> `Z`,
> >>>> then
> >>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> >>>>>
> >>>>> Regarding #3, I'm fine with mapping modules purely by name, and I
> >> think
> >>>>> Jark raised a good point on making the module name a simple
> >> identifier
> >>>>> instead of a string literal. For backward compatibility, since we
> >>> haven't
> >>>>> supported this syntax yet, the affected users are those who defined
> >>>> modules
> >>>>> in the YAML configuration file. Maybe we can eliminate the 'type'
> >> from
> >>>> the
> >>>>> 'requiredContext' to make it optional. Thus the proposed mapping
> >>>> mechanism
> >>>>> could use the module name to lookup the suitable factory,  and in the
> >>>>> meanwhile updating documentation to encourage users to simplify their
> >>>> YAML
> >>>>> configuration. And in the long run, we can deprecate the 'type'.
> >>>>>
> >>>>> Best,
> >>>>> Jane
> >>>>>
> >>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]> wrote:
> >>>>>
> >>>>>> Thanks Jane for starting the discussion.
> >>>>>>
> >>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
> >>> interpreted
> >>>> as
> >>>>>> "setting the current order of modules", which is similar to "setting
> >>> the
> >>>>>> current catalog" for `USE CATALOG`.
> >>>>>>
> >>>>>> Regarding #3, I'm fine to map modules purely by name because I think
> >>> it
> >>>>>> satisfies all the use cases we have at hand. But I guess we need to
> >>> make
> >>>>>> sure we're backward compatible, i.e. users don't need to change
> >> their
> >>>> yaml
> >>>>>> files to configure the modules.
> >>>>>>
> >>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
> >>>>>>
> >>>>>>> Thanks Jane for the summary and starting the discussion in the
> >>> mailing
> >>>>>>> list.
> >>>>>>>
> >>>>>>> Here are my thoughts:
> >>>>>>>
> >>>>>>> 1) syntax to reorder modules
> >>>>>>> I agree with Rui Li it would be quite useful if we can have some
> >>> syntax
> >>>>>> to
> >>>>>>> reorder modules.
> >>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y,
> >>> z`,
> >>>>>>> because USE has a more sense of effective and specifying ordering,
> >>> than
> >>>>>>> RELOAD.
> >>>>>>>   From my feeling, RELOAD just means we unregister and register
> >> x,y,z
> >>>>>> modules
> >>>>>>> again,
> >>>>>>> it sounds like other registered modules are still in use and in the
> >>>>>> order.
> >>>>>>>
> >>>>>>> 3) mapping modules purely by name
> >>>>>>> This can definitely improve the usability of loading modules,
> >> because
> >>>>>>> the 'type=' property
> >>>>>>> looks really redundant. We can think of this as a syntax sugar that
> >>> the
> >>>>>>> default type value is the module name.
> >>>>>>> And we can support to specify 'type=' property in the future to
> >> allow
> >>>>>>> multiple modules for one module type.
> >>>>>>>
> >>>>>>> Besides, I would like to mention one more change, that the module
> >>> name
> >>>>>>> proposed in FLIP-68 is a string literal.
> >>>>>>> But I think we are all on the same page to change it into a simple
> >>>>>>> (non-compound) identifier.
> >>>>>>>
> >>>>>>> LOAD/UNLOAD MODULE 'core'
> >>>>>>> ==>
> >>>>>>> LOAD/UNLOAD MODULE core
> >>>>>>>
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Jark
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]>
> >>> wrote:
> >>>>>>>
> >>>>>>>> Hi everyone,
> >>>>>>>>
> >>>>>>>> I would like to start a discussion on FLINK-21045 [1] about
> >>> supporting
> >>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed
> >> by
> >>>>>>>> FLIP-68 [2] as following.
> >>>>>>>>
> >>>>>>>> -- load a module with the given name and append it to the end of
> >> the
> >>>>>>> module
> >>>>>>>> list
> >>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
> >>>>>>>>
> >>>>>>>> --unload a module by name from the module list and other modules
> >>>> remain
> >>>>>>> in
> >>>>>>>> the same relative positions
> >>>>>>>> UNLOAD MODULE 'name'
> >>>>>>>>
> >>>>>>>> After a round of discussion on the Jira ticket, it seems some
> >>>>>> unanswered
> >>>>>>>> questions need more opinions and suggestions.
> >>>>>>>>
> >>>>>>>> 1. The way to redefine resolution order easily
> >>>>>>>>
> >>>>>>>>       Rui Li suggested introducing `USE MODULES` and adding
> similar
> >>>>>>>> functionality to the API because
> >>>>>>>>
> >>>>>>>>>    1) It's very tedious to unload old modules just to reorder
> >> them.
> >>>>>>>>
> >>>>>>>>    2) Users may not even know how to "re-load" an old module if it
> >>> was
> >>>>>> not
> >>>>>>>>> initially loaded by the user, e.g. don't know which type to use.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>       Jane Chan wondered that module is not like the catalog which
> >>> has
> >>>> a
> >>>>>>>> concept of namespace could specify, and `USE` sounds like a
> >>>>>>>> mutual-exclusive concept.
> >>>>>>>>       Maybe `RELOAD MODULES` can express upgrading the priority of
> >>> the
> >>>>>>> loaded
> >>>>>>>> module(s).
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> >>>>>>>>       Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP
> >> MODULE`
> >>>>>>> instead
> >>>>>>>> of `LOAD/UNLOAD MODULE` because
> >>>>>>>>
> >>>>>>>>>    1) From a pure SQL user's perspective, maybe `CREATE MODULE +
> >> USE
> >>>>>>>> MODULE`
> >>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
> >>>>>>>>>    2) This will be very similar to what the catalog used now.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>     Timo Walther would rather stick to the agreed design because
> >>>>>>>> loading/unloading modules is a concept known from kernels etc.
> >>>>>>>>
> >>>>>>>> 3. Simplify the module design by mapping modules purely by name
> >>>>>>>>
> >>>>>>>> LOAD MODULE geo_utils
> >>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> >>>>>>> 'type='/'module='
> >>>>>>>> but allow only 1 module to be loaded parameterized
> >>>>>>>> UNLOAD hive
> >>>>>>>> USE MODULES hive, core
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Please find more details in the reference link. Looking forward to
> >>>> your
> >>>>>>>> feedback.
> >>>>>>>>
> >>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> >>>>>>>> <
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>>>>>
> >>>>>>>> [2]
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jane
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Best regards!
> >>>>>> Rui Li
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Timo Walther-2
Not the module itself but the ModuleManager should handle this case, yes.

Regards,
Timo


On 01.02.21 17:35, Jane Chan wrote:

> +1 to Jark's proposal
>
>   To make it clearer,  will `module#getFunctionDefinition()` return empty
> suppose the module is loaded but not enabled?
>
> Best,
> Jane
>
> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]> wrote:
>
>> +1 to Jark's proposal
>>
>> I like the difference between just loading and actually enabling these
>> modules.
>>
>> @Rui: I would use the same behavior as catalogs here. You cannot `USE` a
>> catalog without creating it before.
>>
>> Another question is whether a LOAD operation also adds the module to the
>> enabled list by default?
>>
>> Regards,
>> Timo
>>
>> On 01.02.21 13:52, Rui Li wrote:
>>> If `USE MODULES` implies unloading modules that are not listed, does it
>>> also imply loading modules that are not previously loaded, especially
>> since
>>> we're mapping modules by name now?
>>>
>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
>>>
>>>> I agree with Timo that the USE implies the specified modules are in use
>> in
>>>> the specified order and others are not used.
>>>> This would be easier to know what's the result list and order after the
>> USE
>>>> statement.
>>>> That means: if current modules in order are x, y, z. And `USE MODULES
>> z, y`
>>>> means current modules in order are z, y.
>>>>
>>>> But I would like to not unload the unmentioned modules in the USE
>>>> statement. Because it seems strange that USE
>>>> will implicitly remove modules. In the above example, the user may type
>> the
>>>> wrong modules list using USE by mistake
>>>>    and would like to declare the list again, the user has to create the
>>>> module again with some properties he may don't know. Therefore, I
>> propose
>>>> the USE statement just specifies the current module lists and doesn't
>>>> unload modules.
>>>> Besides that, we may need a new syntax to list all the modules including
>>>> not used but loaded.
>>>> We can introduce SHOW FULL MODULES for this purpose with an additional
>>>> `used` column.
>>>>
>>>> For example:
>>>>
>>>> Flink SQL> list modules:
>>>> -----------
>>>> | modules |
>>>> -----------
>>>> | x       |
>>>> | y       |
>>>> | z       |
>>>> -----------
>>>> Flink SQL> USE MODULES z, y;
>>>> Flink SQL> show modules:
>>>> -----------
>>>> | modules |
>>>> -----------
>>>> | z       |
>>>> | y       |
>>>> -----------
>>>> Flink SQL> show FULL modules;
>>>> -------------------
>>>> | modules |  used |
>>>> -------------------
>>>> | z       | true  |
>>>> | y       | true  |
>>>> | x       | false |
>>>> -------------------
>>>> Flink SQL> USE MODULES z, y, x;
>>>> Flink SQL> show modules;
>>>> -----------
>>>> | modules |
>>>> -----------
>>>> | z       |
>>>> | y       |
>>>> | x       |
>>>> -----------
>>>>
>>>> What do you think?
>>>>
>>>> Best,
>>>> Jark
>>>>
>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]> wrote:
>>>>
>>>>> Hi Timo, thanks for the discussion.
>>>>>
>>>>> It seems to reach an agreement regarding #3 that <1> Module name should
>>>>> better be a simple identifier rather than a string literal. <2>
>> Property
>>>>> `type` is redundant and should be removed, and mapping will rely on the
>>>>> module name because loading a module multiple times just using a
>>>> different
>>>>> module name doesn't make much sense. <3> We should migrate to the newer
>>>> API
>>>>> rather than the deprecated `TableFactory` class.
>>>>>
>>>>> Regarding #1, I think the point lies in whether changing the resolution
>>>>> order implies an `unload` operation explicitly (i.e., users could sense
>>>>> it). What do others think?
>>>>>
>>>>> Best,
>>>>> Jane
>>>>>
>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]>
>> wrote:
>>>>>
>>>>>> IMHO I would rather unload the not mentioned modules. The statement
>>>>>> expresses `USE` that implicilty implies that the other modules are
>> "not
>>>>>> used". What do others think?
>>>>>>
>>>>>> Regards,
>>>>>> Timo
>>>>>>
>>>>>>
>>>>>> On 01.02.21 11:28, Jane Chan wrote:
>>>>>>> Hi Jark and Rui,
>>>>>>>
>>>>>>> Thanks for the discussions.
>>>>>>>
>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
>>>>>>>
>>>>>>>> It can be interpreted as "setting the current order of modules",
>>>> which
>>>>>> is
>>>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
>>>>>>>>
>>>>>>> I would like to confirm that the unmentioned modules remain in the
>>>> same
>>>>>>> relative order? E.g., if there are three loaded modules `X`, `Y`,
>>>> `Z`,
>>>>>> then
>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
>>>>>>>
>>>>>>> Regarding #3, I'm fine with mapping modules purely by name, and I
>>>> think
>>>>>>> Jark raised a good point on making the module name a simple
>>>> identifier
>>>>>>> instead of a string literal. For backward compatibility, since we
>>>>> haven't
>>>>>>> supported this syntax yet, the affected users are those who defined
>>>>>> modules
>>>>>>> in the YAML configuration file. Maybe we can eliminate the 'type'
>>>> from
>>>>>> the
>>>>>>> 'requiredContext' to make it optional. Thus the proposed mapping
>>>>>> mechanism
>>>>>>> could use the module name to lookup the suitable factory,  and in the
>>>>>>> meanwhile updating documentation to encourage users to simplify their
>>>>>> YAML
>>>>>>> configuration. And in the long run, we can deprecate the 'type'.
>>>>>>>
>>>>>>> Best,
>>>>>>> Jane
>>>>>>>
>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]> wrote:
>>>>>>>
>>>>>>>> Thanks Jane for starting the discussion.
>>>>>>>>
>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
>>>>> interpreted
>>>>>> as
>>>>>>>> "setting the current order of modules", which is similar to "setting
>>>>> the
>>>>>>>> current catalog" for `USE CATALOG`.
>>>>>>>>
>>>>>>>> Regarding #3, I'm fine to map modules purely by name because I think
>>>>> it
>>>>>>>> satisfies all the use cases we have at hand. But I guess we need to
>>>>> make
>>>>>>>> sure we're backward compatible, i.e. users don't need to change
>>>> their
>>>>>> yaml
>>>>>>>> files to configure the modules.
>>>>>>>>
>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> Thanks Jane for the summary and starting the discussion in the
>>>>> mailing
>>>>>>>>> list.
>>>>>>>>>
>>>>>>>>> Here are my thoughts:
>>>>>>>>>
>>>>>>>>> 1) syntax to reorder modules
>>>>>>>>> I agree with Rui Li it would be quite useful if we can have some
>>>>> syntax
>>>>>>>> to
>>>>>>>>> reorder modules.
>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x, y,
>>>>> z`,
>>>>>>>>> because USE has a more sense of effective and specifying ordering,
>>>>> than
>>>>>>>>> RELOAD.
>>>>>>>>>    From my feeling, RELOAD just means we unregister and register
>>>> x,y,z
>>>>>>>> modules
>>>>>>>>> again,
>>>>>>>>> it sounds like other registered modules are still in use and in the
>>>>>>>> order.
>>>>>>>>>
>>>>>>>>> 3) mapping modules purely by name
>>>>>>>>> This can definitely improve the usability of loading modules,
>>>> because
>>>>>>>>> the 'type=' property
>>>>>>>>> looks really redundant. We can think of this as a syntax sugar that
>>>>> the
>>>>>>>>> default type value is the module name.
>>>>>>>>> And we can support to specify 'type=' property in the future to
>>>> allow
>>>>>>>>> multiple modules for one module type.
>>>>>>>>>
>>>>>>>>> Besides, I would like to mention one more change, that the module
>>>>> name
>>>>>>>>> proposed in FLIP-68 is a string literal.
>>>>>>>>> But I think we are all on the same page to change it into a simple
>>>>>>>>> (non-compound) identifier.
>>>>>>>>>
>>>>>>>>> LOAD/UNLOAD MODULE 'core'
>>>>>>>>> ==>
>>>>>>>>> LOAD/UNLOAD MODULE core
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Jark
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]>
>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi everyone,
>>>>>>>>>>
>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1] about
>>>>> supporting
>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first proposed
>>>> by
>>>>>>>>>> FLIP-68 [2] as following.
>>>>>>>>>>
>>>>>>>>>> -- load a module with the given name and append it to the end of
>>>> the
>>>>>>>>> module
>>>>>>>>>> list
>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
>>>>>>>>>>
>>>>>>>>>> --unload a module by name from the module list and other modules
>>>>>> remain
>>>>>>>>> in
>>>>>>>>>> the same relative positions
>>>>>>>>>> UNLOAD MODULE 'name'
>>>>>>>>>>
>>>>>>>>>> After a round of discussion on the Jira ticket, it seems some
>>>>>>>> unanswered
>>>>>>>>>> questions need more opinions and suggestions.
>>>>>>>>>>
>>>>>>>>>> 1. The way to redefine resolution order easily
>>>>>>>>>>
>>>>>>>>>>        Rui Li suggested introducing `USE MODULES` and adding
>> similar
>>>>>>>>>> functionality to the API because
>>>>>>>>>>
>>>>>>>>>>>     1) It's very tedious to unload old modules just to reorder
>>>> them.
>>>>>>>>>>
>>>>>>>>>>     2) Users may not even know how to "re-load" an old module if it
>>>>> was
>>>>>>>> not
>>>>>>>>>>> initially loaded by the user, e.g. don't know which type to use.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>        Jane Chan wondered that module is not like the catalog which
>>>>> has
>>>>>> a
>>>>>>>>>> concept of namespace could specify, and `USE` sounds like a
>>>>>>>>>> mutual-exclusive concept.
>>>>>>>>>>        Maybe `RELOAD MODULES` can express upgrading the priority of
>>>>> the
>>>>>>>>> loaded
>>>>>>>>>> module(s).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
>>>>>>>>>>        Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP
>>>> MODULE`
>>>>>>>>> instead
>>>>>>>>>> of `LOAD/UNLOAD MODULE` because
>>>>>>>>>>
>>>>>>>>>>>     1) From a pure SQL user's perspective, maybe `CREATE MODULE +
>>>> USE
>>>>>>>>>> MODULE`
>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
>>>>>>>>>>>     2) This will be very similar to what the catalog used now.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>      Timo Walther would rather stick to the agreed design because
>>>>>>>>>> loading/unloading modules is a concept known from kernels etc.
>>>>>>>>>>
>>>>>>>>>> 3. Simplify the module design by mapping modules purely by name
>>>>>>>>>>
>>>>>>>>>> LOAD MODULE geo_utils
>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
>>>>>>>>> 'type='/'module='
>>>>>>>>>> but allow only 1 module to be loaded parameterized
>>>>>>>>>> UNLOAD hive
>>>>>>>>>> USE MODULES hive, core
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please find more details in the reference link. Looking forward to
>>>>>> your
>>>>>>>>>> feedback.
>>>>>>>>>>
>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
>>>>>>>>>> <
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>>>>>>>
>>>>>>>>>> [2]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jane
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Best regards!
>>>>>>>> Rui Li
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jark Wu-2
Hi Timo,

> Another question is whether a LOAD operation also adds the module to the
enabled list by default?

I would like to add the module to the enabled list by default, the main
reasons are:
1) Reordering is an advanced requirement, adding modules needs additional
USE statements with "core" module
 sounds too burdensome. Most users should be satisfied with only LOAD
statements.
2) We should keep compatible for TableEnvironment#loadModule().
3) We are using the LOAD statement instead of CREATE, so I think it's fine
that it does some implicit things.

Best,
Jark

On Tue, 2 Feb 2021 at 00:48, Timo Walther <[hidden email]> wrote:

> Not the module itself but the ModuleManager should handle this case, yes.
>
> Regards,
> Timo
>
>
> On 01.02.21 17:35, Jane Chan wrote:
> > +1 to Jark's proposal
> >
> >   To make it clearer,  will `module#getFunctionDefinition()` return empty
> > suppose the module is loaded but not enabled?
> >
> > Best,
> > Jane
> >
> > On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]> wrote:
> >
> >> +1 to Jark's proposal
> >>
> >> I like the difference between just loading and actually enabling these
> >> modules.
> >>
> >> @Rui: I would use the same behavior as catalogs here. You cannot `USE` a
> >> catalog without creating it before.
> >>
> >> Another question is whether a LOAD operation also adds the module to the
> >> enabled list by default?
> >>
> >> Regards,
> >> Timo
> >>
> >> On 01.02.21 13:52, Rui Li wrote:
> >>> If `USE MODULES` implies unloading modules that are not listed, does it
> >>> also imply loading modules that are not previously loaded, especially
> >> since
> >>> we're mapping modules by name now?
> >>>
> >>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
> >>>
> >>>> I agree with Timo that the USE implies the specified modules are in
> use
> >> in
> >>>> the specified order and others are not used.
> >>>> This would be easier to know what's the result list and order after
> the
> >> USE
> >>>> statement.
> >>>> That means: if current modules in order are x, y, z. And `USE MODULES
> >> z, y`
> >>>> means current modules in order are z, y.
> >>>>
> >>>> But I would like to not unload the unmentioned modules in the USE
> >>>> statement. Because it seems strange that USE
> >>>> will implicitly remove modules. In the above example, the user may
> type
> >> the
> >>>> wrong modules list using USE by mistake
> >>>>    and would like to declare the list again, the user has to create
> the
> >>>> module again with some properties he may don't know. Therefore, I
> >> propose
> >>>> the USE statement just specifies the current module lists and doesn't
> >>>> unload modules.
> >>>> Besides that, we may need a new syntax to list all the modules
> including
> >>>> not used but loaded.
> >>>> We can introduce SHOW FULL MODULES for this purpose with an additional
> >>>> `used` column.
> >>>>
> >>>> For example:
> >>>>
> >>>> Flink SQL> list modules:
> >>>> -----------
> >>>> | modules |
> >>>> -----------
> >>>> | x       |
> >>>> | y       |
> >>>> | z       |
> >>>> -----------
> >>>> Flink SQL> USE MODULES z, y;
> >>>> Flink SQL> show modules:
> >>>> -----------
> >>>> | modules |
> >>>> -----------
> >>>> | z       |
> >>>> | y       |
> >>>> -----------
> >>>> Flink SQL> show FULL modules;
> >>>> -------------------
> >>>> | modules |  used |
> >>>> -------------------
> >>>> | z       | true  |
> >>>> | y       | true  |
> >>>> | x       | false |
> >>>> -------------------
> >>>> Flink SQL> USE MODULES z, y, x;
> >>>> Flink SQL> show modules;
> >>>> -----------
> >>>> | modules |
> >>>> -----------
> >>>> | z       |
> >>>> | y       |
> >>>> | x       |
> >>>> -----------
> >>>>
> >>>> What do you think?
> >>>>
> >>>> Best,
> >>>> Jark
> >>>>
> >>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]> wrote:
> >>>>
> >>>>> Hi Timo, thanks for the discussion.
> >>>>>
> >>>>> It seems to reach an agreement regarding #3 that <1> Module name
> should
> >>>>> better be a simple identifier rather than a string literal. <2>
> >> Property
> >>>>> `type` is redundant and should be removed, and mapping will rely on
> the
> >>>>> module name because loading a module multiple times just using a
> >>>> different
> >>>>> module name doesn't make much sense. <3> We should migrate to the
> newer
> >>>> API
> >>>>> rather than the deprecated `TableFactory` class.
> >>>>>
> >>>>> Regarding #1, I think the point lies in whether changing the
> resolution
> >>>>> order implies an `unload` operation explicitly (i.e., users could
> sense
> >>>>> it). What do others think?
> >>>>>
> >>>>> Best,
> >>>>> Jane
> >>>>>
> >>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]>
> >> wrote:
> >>>>>
> >>>>>> IMHO I would rather unload the not mentioned modules. The statement
> >>>>>> expresses `USE` that implicilty implies that the other modules are
> >> "not
> >>>>>> used". What do others think?
> >>>>>>
> >>>>>> Regards,
> >>>>>> Timo
> >>>>>>
> >>>>>>
> >>>>>> On 01.02.21 11:28, Jane Chan wrote:
> >>>>>>> Hi Jark and Rui,
> >>>>>>>
> >>>>>>> Thanks for the discussions.
> >>>>>>>
> >>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
> >>>>>>>
> >>>>>>>> It can be interpreted as "setting the current order of modules",
> >>>> which
> >>>>>> is
> >>>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
> >>>>>>>>
> >>>>>>> I would like to confirm that the unmentioned modules remain in the
> >>>> same
> >>>>>>> relative order? E.g., if there are three loaded modules `X`, `Y`,
> >>>> `Z`,
> >>>>>> then
> >>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> >>>>>>>
> >>>>>>> Regarding #3, I'm fine with mapping modules purely by name, and I
> >>>> think
> >>>>>>> Jark raised a good point on making the module name a simple
> >>>> identifier
> >>>>>>> instead of a string literal. For backward compatibility, since we
> >>>>> haven't
> >>>>>>> supported this syntax yet, the affected users are those who defined
> >>>>>> modules
> >>>>>>> in the YAML configuration file. Maybe we can eliminate the 'type'
> >>>> from
> >>>>>> the
> >>>>>>> 'requiredContext' to make it optional. Thus the proposed mapping
> >>>>>> mechanism
> >>>>>>> could use the module name to lookup the suitable factory,  and in
> the
> >>>>>>> meanwhile updating documentation to encourage users to simplify
> their
> >>>>>> YAML
> >>>>>>> configuration. And in the long run, we can deprecate the 'type'.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Jane
> >>>>>>>
> >>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]>
> wrote:
> >>>>>>>
> >>>>>>>> Thanks Jane for starting the discussion.
> >>>>>>>>
> >>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
> >>>>> interpreted
> >>>>>> as
> >>>>>>>> "setting the current order of modules", which is similar to
> "setting
> >>>>> the
> >>>>>>>> current catalog" for `USE CATALOG`.
> >>>>>>>>
> >>>>>>>> Regarding #3, I'm fine to map modules purely by name because I
> think
> >>>>> it
> >>>>>>>> satisfies all the use cases we have at hand. But I guess we need
> to
> >>>>> make
> >>>>>>>> sure we're backward compatible, i.e. users don't need to change
> >>>> their
> >>>>>> yaml
> >>>>>>>> files to configure the modules.
> >>>>>>>>
> >>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks Jane for the summary and starting the discussion in the
> >>>>> mailing
> >>>>>>>>> list.
> >>>>>>>>>
> >>>>>>>>> Here are my thoughts:
> >>>>>>>>>
> >>>>>>>>> 1) syntax to reorder modules
> >>>>>>>>> I agree with Rui Li it would be quite useful if we can have some
> >>>>> syntax
> >>>>>>>> to
> >>>>>>>>> reorder modules.
> >>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x,
> y,
> >>>>> z`,
> >>>>>>>>> because USE has a more sense of effective and specifying
> ordering,
> >>>>> than
> >>>>>>>>> RELOAD.
> >>>>>>>>>    From my feeling, RELOAD just means we unregister and register
> >>>> x,y,z
> >>>>>>>> modules
> >>>>>>>>> again,
> >>>>>>>>> it sounds like other registered modules are still in use and in
> the
> >>>>>>>> order.
> >>>>>>>>>
> >>>>>>>>> 3) mapping modules purely by name
> >>>>>>>>> This can definitely improve the usability of loading modules,
> >>>> because
> >>>>>>>>> the 'type=' property
> >>>>>>>>> looks really redundant. We can think of this as a syntax sugar
> that
> >>>>> the
> >>>>>>>>> default type value is the module name.
> >>>>>>>>> And we can support to specify 'type=' property in the future to
> >>>> allow
> >>>>>>>>> multiple modules for one module type.
> >>>>>>>>>
> >>>>>>>>> Besides, I would like to mention one more change, that the module
> >>>>> name
> >>>>>>>>> proposed in FLIP-68 is a string literal.
> >>>>>>>>> But I think we are all on the same page to change it into a
> simple
> >>>>>>>>> (non-compound) identifier.
> >>>>>>>>>
> >>>>>>>>> LOAD/UNLOAD MODULE 'core'
> >>>>>>>>> ==>
> >>>>>>>>> LOAD/UNLOAD MODULE core
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Jark
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]>
> >>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi everyone,
> >>>>>>>>>>
> >>>>>>>>>> I would like to start a discussion on FLINK-21045 [1] about
> >>>>> supporting
> >>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first
> proposed
> >>>> by
> >>>>>>>>>> FLIP-68 [2] as following.
> >>>>>>>>>>
> >>>>>>>>>> -- load a module with the given name and append it to the end of
> >>>> the
> >>>>>>>>> module
> >>>>>>>>>> list
> >>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
> >>>>>>>>>>
> >>>>>>>>>> --unload a module by name from the module list and other modules
> >>>>>> remain
> >>>>>>>>> in
> >>>>>>>>>> the same relative positions
> >>>>>>>>>> UNLOAD MODULE 'name'
> >>>>>>>>>>
> >>>>>>>>>> After a round of discussion on the Jira ticket, it seems some
> >>>>>>>> unanswered
> >>>>>>>>>> questions need more opinions and suggestions.
> >>>>>>>>>>
> >>>>>>>>>> 1. The way to redefine resolution order easily
> >>>>>>>>>>
> >>>>>>>>>>        Rui Li suggested introducing `USE MODULES` and adding
> >> similar
> >>>>>>>>>> functionality to the API because
> >>>>>>>>>>
> >>>>>>>>>>>     1) It's very tedious to unload old modules just to reorder
> >>>> them.
> >>>>>>>>>>
> >>>>>>>>>>     2) Users may not even know how to "re-load" an old module
> if it
> >>>>> was
> >>>>>>>> not
> >>>>>>>>>>> initially loaded by the user, e.g. don't know which type to
> use.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>        Jane Chan wondered that module is not like the catalog
> which
> >>>>> has
> >>>>>> a
> >>>>>>>>>> concept of namespace could specify, and `USE` sounds like a
> >>>>>>>>>> mutual-exclusive concept.
> >>>>>>>>>>        Maybe `RELOAD MODULES` can express upgrading the
> priority of
> >>>>> the
> >>>>>>>>> loaded
> >>>>>>>>>> module(s).
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> >>>>>>>>>>        Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP
> >>>> MODULE`
> >>>>>>>>> instead
> >>>>>>>>>> of `LOAD/UNLOAD MODULE` because
> >>>>>>>>>>
> >>>>>>>>>>>     1) From a pure SQL user's perspective, maybe `CREATE
> MODULE +
> >>>> USE
> >>>>>>>>>> MODULE`
> >>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
> >>>>>>>>>>>     2) This will be very similar to what the catalog used now.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>      Timo Walther would rather stick to the agreed design
> because
> >>>>>>>>>> loading/unloading modules is a concept known from kernels etc.
> >>>>>>>>>>
> >>>>>>>>>> 3. Simplify the module design by mapping modules purely by name
> >>>>>>>>>>
> >>>>>>>>>> LOAD MODULE geo_utils
> >>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> >>>>>>>>> 'type='/'module='
> >>>>>>>>>> but allow only 1 module to be loaded parameterized
> >>>>>>>>>> UNLOAD hive
> >>>>>>>>>> USE MODULES hive, core
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Please find more details in the reference link. Looking forward
> to
> >>>>>> your
> >>>>>>>>>> feedback.
> >>>>>>>>>>
> >>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> >>>>>>>>>> <
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>>>>>>>
> >>>>>>>>>> [2]
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Jane
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> Best regards!
> >>>>>>>> Rui Li
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Timo Walther-2
+1

@Jane Can you summarize our discussion in the JIRA issue?

Thanks,
Timo


On 02.02.21 03:50, Jark Wu wrote:

> Hi Timo,
>
>> Another question is whether a LOAD operation also adds the module to the
> enabled list by default?
>
> I would like to add the module to the enabled list by default, the main
> reasons are:
> 1) Reordering is an advanced requirement, adding modules needs additional
> USE statements with "core" module
>   sounds too burdensome. Most users should be satisfied with only LOAD
> statements.
> 2) We should keep compatible for TableEnvironment#loadModule().
> 3) We are using the LOAD statement instead of CREATE, so I think it's fine
> that it does some implicit things.
>
> Best,
> Jark
>
> On Tue, 2 Feb 2021 at 00:48, Timo Walther <[hidden email]> wrote:
>
>> Not the module itself but the ModuleManager should handle this case, yes.
>>
>> Regards,
>> Timo
>>
>>
>> On 01.02.21 17:35, Jane Chan wrote:
>>> +1 to Jark's proposal
>>>
>>>    To make it clearer,  will `module#getFunctionDefinition()` return empty
>>> suppose the module is loaded but not enabled?
>>>
>>> Best,
>>> Jane
>>>
>>> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]> wrote:
>>>
>>>> +1 to Jark's proposal
>>>>
>>>> I like the difference between just loading and actually enabling these
>>>> modules.
>>>>
>>>> @Rui: I would use the same behavior as catalogs here. You cannot `USE` a
>>>> catalog without creating it before.
>>>>
>>>> Another question is whether a LOAD operation also adds the module to the
>>>> enabled list by default?
>>>>
>>>> Regards,
>>>> Timo
>>>>
>>>> On 01.02.21 13:52, Rui Li wrote:
>>>>> If `USE MODULES` implies unloading modules that are not listed, does it
>>>>> also imply loading modules that are not previously loaded, especially
>>>> since
>>>>> we're mapping modules by name now?
>>>>>
>>>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
>>>>>
>>>>>> I agree with Timo that the USE implies the specified modules are in
>> use
>>>> in
>>>>>> the specified order and others are not used.
>>>>>> This would be easier to know what's the result list and order after
>> the
>>>> USE
>>>>>> statement.
>>>>>> That means: if current modules in order are x, y, z. And `USE MODULES
>>>> z, y`
>>>>>> means current modules in order are z, y.
>>>>>>
>>>>>> But I would like to not unload the unmentioned modules in the USE
>>>>>> statement. Because it seems strange that USE
>>>>>> will implicitly remove modules. In the above example, the user may
>> type
>>>> the
>>>>>> wrong modules list using USE by mistake
>>>>>>     and would like to declare the list again, the user has to create
>> the
>>>>>> module again with some properties he may don't know. Therefore, I
>>>> propose
>>>>>> the USE statement just specifies the current module lists and doesn't
>>>>>> unload modules.
>>>>>> Besides that, we may need a new syntax to list all the modules
>> including
>>>>>> not used but loaded.
>>>>>> We can introduce SHOW FULL MODULES for this purpose with an additional
>>>>>> `used` column.
>>>>>>
>>>>>> For example:
>>>>>>
>>>>>> Flink SQL> list modules:
>>>>>> -----------
>>>>>> | modules |
>>>>>> -----------
>>>>>> | x       |
>>>>>> | y       |
>>>>>> | z       |
>>>>>> -----------
>>>>>> Flink SQL> USE MODULES z, y;
>>>>>> Flink SQL> show modules:
>>>>>> -----------
>>>>>> | modules |
>>>>>> -----------
>>>>>> | z       |
>>>>>> | y       |
>>>>>> -----------
>>>>>> Flink SQL> show FULL modules;
>>>>>> -------------------
>>>>>> | modules |  used |
>>>>>> -------------------
>>>>>> | z       | true  |
>>>>>> | y       | true  |
>>>>>> | x       | false |
>>>>>> -------------------
>>>>>> Flink SQL> USE MODULES z, y, x;
>>>>>> Flink SQL> show modules;
>>>>>> -----------
>>>>>> | modules |
>>>>>> -----------
>>>>>> | z       |
>>>>>> | y       |
>>>>>> | x       |
>>>>>> -----------
>>>>>>
>>>>>> What do you think?
>>>>>>
>>>>>> Best,
>>>>>> Jark
>>>>>>
>>>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]> wrote:
>>>>>>
>>>>>>> Hi Timo, thanks for the discussion.
>>>>>>>
>>>>>>> It seems to reach an agreement regarding #3 that <1> Module name
>> should
>>>>>>> better be a simple identifier rather than a string literal. <2>
>>>> Property
>>>>>>> `type` is redundant and should be removed, and mapping will rely on
>> the
>>>>>>> module name because loading a module multiple times just using a
>>>>>> different
>>>>>>> module name doesn't make much sense. <3> We should migrate to the
>> newer
>>>>>> API
>>>>>>> rather than the deprecated `TableFactory` class.
>>>>>>>
>>>>>>> Regarding #1, I think the point lies in whether changing the
>> resolution
>>>>>>> order implies an `unload` operation explicitly (i.e., users could
>> sense
>>>>>>> it). What do others think?
>>>>>>>
>>>>>>> Best,
>>>>>>> Jane
>>>>>>>
>>>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]>
>>>> wrote:
>>>>>>>
>>>>>>>> IMHO I would rather unload the not mentioned modules. The statement
>>>>>>>> expresses `USE` that implicilty implies that the other modules are
>>>> "not
>>>>>>>> used". What do others think?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Timo
>>>>>>>>
>>>>>>>>
>>>>>>>> On 01.02.21 11:28, Jane Chan wrote:
>>>>>>>>> Hi Jark and Rui,
>>>>>>>>>
>>>>>>>>> Thanks for the discussions.
>>>>>>>>>
>>>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
>>>>>>>>>
>>>>>>>>>> It can be interpreted as "setting the current order of modules",
>>>>>> which
>>>>>>>> is
>>>>>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
>>>>>>>>>>
>>>>>>>>> I would like to confirm that the unmentioned modules remain in the
>>>>>> same
>>>>>>>>> relative order? E.g., if there are three loaded modules `X`, `Y`,
>>>>>> `Z`,
>>>>>>>> then
>>>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
>>>>>>>>>
>>>>>>>>> Regarding #3, I'm fine with mapping modules purely by name, and I
>>>>>> think
>>>>>>>>> Jark raised a good point on making the module name a simple
>>>>>> identifier
>>>>>>>>> instead of a string literal. For backward compatibility, since we
>>>>>>> haven't
>>>>>>>>> supported this syntax yet, the affected users are those who defined
>>>>>>>> modules
>>>>>>>>> in the YAML configuration file. Maybe we can eliminate the 'type'
>>>>>> from
>>>>>>>> the
>>>>>>>>> 'requiredContext' to make it optional. Thus the proposed mapping
>>>>>>>> mechanism
>>>>>>>>> could use the module name to lookup the suitable factory,  and in
>> the
>>>>>>>>> meanwhile updating documentation to encourage users to simplify
>> their
>>>>>>>> YAML
>>>>>>>>> configuration. And in the long run, we can deprecate the 'type'.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Jane
>>>>>>>>>
>>>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]>
>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Jane for starting the discussion.
>>>>>>>>>>
>>>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
>>>>>>> interpreted
>>>>>>>> as
>>>>>>>>>> "setting the current order of modules", which is similar to
>> "setting
>>>>>>> the
>>>>>>>>>> current catalog" for `USE CATALOG`.
>>>>>>>>>>
>>>>>>>>>> Regarding #3, I'm fine to map modules purely by name because I
>> think
>>>>>>> it
>>>>>>>>>> satisfies all the use cases we have at hand. But I guess we need
>> to
>>>>>>> make
>>>>>>>>>> sure we're backward compatible, i.e. users don't need to change
>>>>>> their
>>>>>>>> yaml
>>>>>>>>>> files to configure the modules.
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Jane for the summary and starting the discussion in the
>>>>>>> mailing
>>>>>>>>>>> list.
>>>>>>>>>>>
>>>>>>>>>>> Here are my thoughts:
>>>>>>>>>>>
>>>>>>>>>>> 1) syntax to reorder modules
>>>>>>>>>>> I agree with Rui Li it would be quite useful if we can have some
>>>>>>> syntax
>>>>>>>>>> to
>>>>>>>>>>> reorder modules.
>>>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x,
>> y,
>>>>>>> z`,
>>>>>>>>>>> because USE has a more sense of effective and specifying
>> ordering,
>>>>>>> than
>>>>>>>>>>> RELOAD.
>>>>>>>>>>>     From my feeling, RELOAD just means we unregister and register
>>>>>> x,y,z
>>>>>>>>>> modules
>>>>>>>>>>> again,
>>>>>>>>>>> it sounds like other registered modules are still in use and in
>> the
>>>>>>>>>> order.
>>>>>>>>>>>
>>>>>>>>>>> 3) mapping modules purely by name
>>>>>>>>>>> This can definitely improve the usability of loading modules,
>>>>>> because
>>>>>>>>>>> the 'type=' property
>>>>>>>>>>> looks really redundant. We can think of this as a syntax sugar
>> that
>>>>>>> the
>>>>>>>>>>> default type value is the module name.
>>>>>>>>>>> And we can support to specify 'type=' property in the future to
>>>>>> allow
>>>>>>>>>>> multiple modules for one module type.
>>>>>>>>>>>
>>>>>>>>>>> Besides, I would like to mention one more change, that the module
>>>>>>> name
>>>>>>>>>>> proposed in FLIP-68 is a string literal.
>>>>>>>>>>> But I think we are all on the same page to change it into a
>> simple
>>>>>>>>>>> (non-compound) identifier.
>>>>>>>>>>>
>>>>>>>>>>> LOAD/UNLOAD MODULE 'core'
>>>>>>>>>>> ==>
>>>>>>>>>>> LOAD/UNLOAD MODULE core
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Jark
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]>
>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>
>>>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1] about
>>>>>>> supporting
>>>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first
>> proposed
>>>>>> by
>>>>>>>>>>>> FLIP-68 [2] as following.
>>>>>>>>>>>>
>>>>>>>>>>>> -- load a module with the given name and append it to the end of
>>>>>> the
>>>>>>>>>>> module
>>>>>>>>>>>> list
>>>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
>>>>>>>>>>>>
>>>>>>>>>>>> --unload a module by name from the module list and other modules
>>>>>>>> remain
>>>>>>>>>>> in
>>>>>>>>>>>> the same relative positions
>>>>>>>>>>>> UNLOAD MODULE 'name'
>>>>>>>>>>>>
>>>>>>>>>>>> After a round of discussion on the Jira ticket, it seems some
>>>>>>>>>> unanswered
>>>>>>>>>>>> questions need more opinions and suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> 1. The way to redefine resolution order easily
>>>>>>>>>>>>
>>>>>>>>>>>>         Rui Li suggested introducing `USE MODULES` and adding
>>>> similar
>>>>>>>>>>>> functionality to the API because
>>>>>>>>>>>>
>>>>>>>>>>>>>      1) It's very tedious to unload old modules just to reorder
>>>>>> them.
>>>>>>>>>>>>
>>>>>>>>>>>>      2) Users may not even know how to "re-load" an old module
>> if it
>>>>>>> was
>>>>>>>>>> not
>>>>>>>>>>>>> initially loaded by the user, e.g. don't know which type to
>> use.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>         Jane Chan wondered that module is not like the catalog
>> which
>>>>>>> has
>>>>>>>> a
>>>>>>>>>>>> concept of namespace could specify, and `USE` sounds like a
>>>>>>>>>>>> mutual-exclusive concept.
>>>>>>>>>>>>         Maybe `RELOAD MODULES` can express upgrading the
>> priority of
>>>>>>> the
>>>>>>>>>>> loaded
>>>>>>>>>>>> module(s).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
>>>>>>>>>>>>         Jark Wu and Nicholas Jiang proposed to use `CREATE/DROP
>>>>>> MODULE`
>>>>>>>>>>> instead
>>>>>>>>>>>> of `LOAD/UNLOAD MODULE` because
>>>>>>>>>>>>
>>>>>>>>>>>>>      1) From a pure SQL user's perspective, maybe `CREATE
>> MODULE +
>>>>>> USE
>>>>>>>>>>>> MODULE`
>>>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
>>>>>>>>>>>>>      2) This will be very similar to what the catalog used now.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>       Timo Walther would rather stick to the agreed design
>> because
>>>>>>>>>>>> loading/unloading modules is a concept known from kernels etc.
>>>>>>>>>>>>
>>>>>>>>>>>> 3. Simplify the module design by mapping modules purely by name
>>>>>>>>>>>>
>>>>>>>>>>>> LOAD MODULE geo_utils
>>>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
>>>>>>>>>>> 'type='/'module='
>>>>>>>>>>>> but allow only 1 module to be loaded parameterized
>>>>>>>>>>>> UNLOAD hive
>>>>>>>>>>>> USE MODULES hive, core
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Please find more details in the reference link. Looking forward
>> to
>>>>>>>> your
>>>>>>>>>>>> feedback.
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
>>>>>>>>>>>> <
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>>>>>>>>>
>>>>>>>>>>>> [2]
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Jane
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Best regards!
>>>>>>>>>> Rui Li
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jane Chan
Hi everyone,

Thanks for the discussion to make this improvement plan clearer.

Hi, @Jark, @Rui, and @Timo, I'm collecting the final discussion summaries
now and want to confirm one thing that for the statement `USE MODULES x [,
y, z, ...]`, if the module name list contains an unexsited module, shall we
#1 fail the execution for all of them or #2 enabled the rest modules and
return a warning to users? My personal preference goes to #1 for
simplicity. What do you think?

Best,
Jane

On Tue, Feb 2, 2021 at 3:53 PM Timo Walther <[hidden email]> wrote:

> +1
>
> @Jane Can you summarize our discussion in the JIRA issue?
>
> Thanks,
> Timo
>
>
> On 02.02.21 03:50, Jark Wu wrote:
> > Hi Timo,
> >
> >> Another question is whether a LOAD operation also adds the module to the
> > enabled list by default?
> >
> > I would like to add the module to the enabled list by default, the main
> > reasons are:
> > 1) Reordering is an advanced requirement, adding modules needs additional
> > USE statements with "core" module
> >   sounds too burdensome. Most users should be satisfied with only LOAD
> > statements.
> > 2) We should keep compatible for TableEnvironment#loadModule().
> > 3) We are using the LOAD statement instead of CREATE, so I think it's
> fine
> > that it does some implicit things.
> >
> > Best,
> > Jark
> >
> > On Tue, 2 Feb 2021 at 00:48, Timo Walther <[hidden email]> wrote:
> >
> >> Not the module itself but the ModuleManager should handle this case,
> yes.
> >>
> >> Regards,
> >> Timo
> >>
> >>
> >> On 01.02.21 17:35, Jane Chan wrote:
> >>> +1 to Jark's proposal
> >>>
> >>>    To make it clearer,  will `module#getFunctionDefinition()` return
> empty
> >>> suppose the module is loaded but not enabled?
> >>>
> >>> Best,
> >>> Jane
> >>>
> >>> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]>
> wrote:
> >>>
> >>>> +1 to Jark's proposal
> >>>>
> >>>> I like the difference between just loading and actually enabling these
> >>>> modules.
> >>>>
> >>>> @Rui: I would use the same behavior as catalogs here. You cannot
> `USE` a
> >>>> catalog without creating it before.
> >>>>
> >>>> Another question is whether a LOAD operation also adds the module to
> the
> >>>> enabled list by default?
> >>>>
> >>>> Regards,
> >>>> Timo
> >>>>
> >>>> On 01.02.21 13:52, Rui Li wrote:
> >>>>> If `USE MODULES` implies unloading modules that are not listed, does
> it
> >>>>> also imply loading modules that are not previously loaded, especially
> >>>> since
> >>>>> we're mapping modules by name now?
> >>>>>
> >>>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
> >>>>>
> >>>>>> I agree with Timo that the USE implies the specified modules are in
> >> use
> >>>> in
> >>>>>> the specified order and others are not used.
> >>>>>> This would be easier to know what's the result list and order after
> >> the
> >>>> USE
> >>>>>> statement.
> >>>>>> That means: if current modules in order are x, y, z. And `USE
> MODULES
> >>>> z, y`
> >>>>>> means current modules in order are z, y.
> >>>>>>
> >>>>>> But I would like to not unload the unmentioned modules in the USE
> >>>>>> statement. Because it seems strange that USE
> >>>>>> will implicitly remove modules. In the above example, the user may
> >> type
> >>>> the
> >>>>>> wrong modules list using USE by mistake
> >>>>>>     and would like to declare the list again, the user has to create
> >> the
> >>>>>> module again with some properties he may don't know. Therefore, I
> >>>> propose
> >>>>>> the USE statement just specifies the current module lists and
> doesn't
> >>>>>> unload modules.
> >>>>>> Besides that, we may need a new syntax to list all the modules
> >> including
> >>>>>> not used but loaded.
> >>>>>> We can introduce SHOW FULL MODULES for this purpose with an
> additional
> >>>>>> `used` column.
> >>>>>>
> >>>>>> For example:
> >>>>>>
> >>>>>> Flink SQL> list modules:
> >>>>>> -----------
> >>>>>> | modules |
> >>>>>> -----------
> >>>>>> | x       |
> >>>>>> | y       |
> >>>>>> | z       |
> >>>>>> -----------
> >>>>>> Flink SQL> USE MODULES z, y;
> >>>>>> Flink SQL> show modules:
> >>>>>> -----------
> >>>>>> | modules |
> >>>>>> -----------
> >>>>>> | z       |
> >>>>>> | y       |
> >>>>>> -----------
> >>>>>> Flink SQL> show FULL modules;
> >>>>>> -------------------
> >>>>>> | modules |  used |
> >>>>>> -------------------
> >>>>>> | z       | true  |
> >>>>>> | y       | true  |
> >>>>>> | x       | false |
> >>>>>> -------------------
> >>>>>> Flink SQL> USE MODULES z, y, x;
> >>>>>> Flink SQL> show modules;
> >>>>>> -----------
> >>>>>> | modules |
> >>>>>> -----------
> >>>>>> | z       |
> >>>>>> | y       |
> >>>>>> | x       |
> >>>>>> -----------
> >>>>>>
> >>>>>> What do you think?
> >>>>>>
> >>>>>> Best,
> >>>>>> Jark
> >>>>>>
> >>>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]>
> wrote:
> >>>>>>
> >>>>>>> Hi Timo, thanks for the discussion.
> >>>>>>>
> >>>>>>> It seems to reach an agreement regarding #3 that <1> Module name
> >> should
> >>>>>>> better be a simple identifier rather than a string literal. <2>
> >>>> Property
> >>>>>>> `type` is redundant and should be removed, and mapping will rely on
> >> the
> >>>>>>> module name because loading a module multiple times just using a
> >>>>>> different
> >>>>>>> module name doesn't make much sense. <3> We should migrate to the
> >> newer
> >>>>>> API
> >>>>>>> rather than the deprecated `TableFactory` class.
> >>>>>>>
> >>>>>>> Regarding #1, I think the point lies in whether changing the
> >> resolution
> >>>>>>> order implies an `unload` operation explicitly (i.e., users could
> >> sense
> >>>>>>> it). What do others think?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Jane
> >>>>>>>
> >>>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]>
> >>>> wrote:
> >>>>>>>
> >>>>>>>> IMHO I would rather unload the not mentioned modules. The
> statement
> >>>>>>>> expresses `USE` that implicilty implies that the other modules are
> >>>> "not
> >>>>>>>> used". What do others think?
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Timo
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 01.02.21 11:28, Jane Chan wrote:
> >>>>>>>>> Hi Jark and Rui,
> >>>>>>>>>
> >>>>>>>>> Thanks for the discussions.
> >>>>>>>>>
> >>>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
> >>>>>>>>>
> >>>>>>>>>> It can be interpreted as "setting the current order of modules",
> >>>>>> which
> >>>>>>>> is
> >>>>>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
> >>>>>>>>>>
> >>>>>>>>> I would like to confirm that the unmentioned modules remain in
> the
> >>>>>> same
> >>>>>>>>> relative order? E.g., if there are three loaded modules `X`, `Y`,
> >>>>>> `Z`,
> >>>>>>>> then
> >>>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> >>>>>>>>>
> >>>>>>>>> Regarding #3, I'm fine with mapping modules purely by name, and I
> >>>>>> think
> >>>>>>>>> Jark raised a good point on making the module name a simple
> >>>>>> identifier
> >>>>>>>>> instead of a string literal. For backward compatibility, since we
> >>>>>>> haven't
> >>>>>>>>> supported this syntax yet, the affected users are those who
> defined
> >>>>>>>> modules
> >>>>>>>>> in the YAML configuration file. Maybe we can eliminate the 'type'
> >>>>>> from
> >>>>>>>> the
> >>>>>>>>> 'requiredContext' to make it optional. Thus the proposed mapping
> >>>>>>>> mechanism
> >>>>>>>>> could use the module name to lookup the suitable factory,  and in
> >> the
> >>>>>>>>> meanwhile updating documentation to encourage users to simplify
> >> their
> >>>>>>>> YAML
> >>>>>>>>> configuration. And in the long run, we can deprecate the 'type'.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Jane
> >>>>>>>>>
> >>>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]>
> >> wrote:
> >>>>>>>>>
> >>>>>>>>>> Thanks Jane for starting the discussion.
> >>>>>>>>>>
> >>>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
> >>>>>>> interpreted
> >>>>>>>> as
> >>>>>>>>>> "setting the current order of modules", which is similar to
> >> "setting
> >>>>>>> the
> >>>>>>>>>> current catalog" for `USE CATALOG`.
> >>>>>>>>>>
> >>>>>>>>>> Regarding #3, I'm fine to map modules purely by name because I
> >> think
> >>>>>>> it
> >>>>>>>>>> satisfies all the use cases we have at hand. But I guess we need
> >> to
> >>>>>>> make
> >>>>>>>>>> sure we're backward compatible, i.e. users don't need to change
> >>>>>> their
> >>>>>>>> yaml
> >>>>>>>>>> files to configure the modules.
> >>>>>>>>>>
> >>>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]>
> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks Jane for the summary and starting the discussion in the
> >>>>>>> mailing
> >>>>>>>>>>> list.
> >>>>>>>>>>>
> >>>>>>>>>>> Here are my thoughts:
> >>>>>>>>>>>
> >>>>>>>>>>> 1) syntax to reorder modules
> >>>>>>>>>>> I agree with Rui Li it would be quite useful if we can have
> some
> >>>>>>> syntax
> >>>>>>>>>> to
> >>>>>>>>>>> reorder modules.
> >>>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES x,
> >> y,
> >>>>>>> z`,
> >>>>>>>>>>> because USE has a more sense of effective and specifying
> >> ordering,
> >>>>>>> than
> >>>>>>>>>>> RELOAD.
> >>>>>>>>>>>     From my feeling, RELOAD just means we unregister and
> register
> >>>>>> x,y,z
> >>>>>>>>>> modules
> >>>>>>>>>>> again,
> >>>>>>>>>>> it sounds like other registered modules are still in use and in
> >> the
> >>>>>>>>>> order.
> >>>>>>>>>>>
> >>>>>>>>>>> 3) mapping modules purely by name
> >>>>>>>>>>> This can definitely improve the usability of loading modules,
> >>>>>> because
> >>>>>>>>>>> the 'type=' property
> >>>>>>>>>>> looks really redundant. We can think of this as a syntax sugar
> >> that
> >>>>>>> the
> >>>>>>>>>>> default type value is the module name.
> >>>>>>>>>>> And we can support to specify 'type=' property in the future to
> >>>>>> allow
> >>>>>>>>>>> multiple modules for one module type.
> >>>>>>>>>>>
> >>>>>>>>>>> Besides, I would like to mention one more change, that the
> module
> >>>>>>> name
> >>>>>>>>>>> proposed in FLIP-68 is a string literal.
> >>>>>>>>>>> But I think we are all on the same page to change it into a
> >> simple
> >>>>>>>>>>> (non-compound) identifier.
> >>>>>>>>>>>
> >>>>>>>>>>> LOAD/UNLOAD MODULE 'core'
> >>>>>>>>>>> ==>
> >>>>>>>>>>> LOAD/UNLOAD MODULE core
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Jark
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <[hidden email]
> >
> >>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>
> >>>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1] about
> >>>>>>> supporting
> >>>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first
> >> proposed
> >>>>>> by
> >>>>>>>>>>>> FLIP-68 [2] as following.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -- load a module with the given name and append it to the end
> of
> >>>>>> the
> >>>>>>>>>>> module
> >>>>>>>>>>>> list
> >>>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp', ...)]
> >>>>>>>>>>>>
> >>>>>>>>>>>> --unload a module by name from the module list and other
> modules
> >>>>>>>> remain
> >>>>>>>>>>> in
> >>>>>>>>>>>> the same relative positions
> >>>>>>>>>>>> UNLOAD MODULE 'name'
> >>>>>>>>>>>>
> >>>>>>>>>>>> After a round of discussion on the Jira ticket, it seems some
> >>>>>>>>>> unanswered
> >>>>>>>>>>>> questions need more opinions and suggestions.
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1. The way to redefine resolution order easily
> >>>>>>>>>>>>
> >>>>>>>>>>>>         Rui Li suggested introducing `USE MODULES` and adding
> >>>> similar
> >>>>>>>>>>>> functionality to the API because
> >>>>>>>>>>>>
> >>>>>>>>>>>>>      1) It's very tedious to unload old modules just to
> reorder
> >>>>>> them.
> >>>>>>>>>>>>
> >>>>>>>>>>>>      2) Users may not even know how to "re-load" an old module
> >> if it
> >>>>>>> was
> >>>>>>>>>> not
> >>>>>>>>>>>>> initially loaded by the user, e.g. don't know which type to
> >> use.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>         Jane Chan wondered that module is not like the catalog
> >> which
> >>>>>>> has
> >>>>>>>> a
> >>>>>>>>>>>> concept of namespace could specify, and `USE` sounds like a
> >>>>>>>>>>>> mutual-exclusive concept.
> >>>>>>>>>>>>         Maybe `RELOAD MODULES` can express upgrading the
> >> priority of
> >>>>>>> the
> >>>>>>>>>>> loaded
> >>>>>>>>>>>> module(s).
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> >>>>>>>>>>>>         Jark Wu and Nicholas Jiang proposed to use
> `CREATE/DROP
> >>>>>> MODULE`
> >>>>>>>>>>> instead
> >>>>>>>>>>>> of `LOAD/UNLOAD MODULE` because
> >>>>>>>>>>>>
> >>>>>>>>>>>>>      1) From a pure SQL user's perspective, maybe `CREATE
> >> MODULE +
> >>>>>> USE
> >>>>>>>>>>>> MODULE`
> >>>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
> >>>>>>>>>>>>>      2) This will be very similar to what the catalog used
> now.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>       Timo Walther would rather stick to the agreed design
> >> because
> >>>>>>>>>>>> loading/unloading modules is a concept known from kernels etc.
> >>>>>>>>>>>>
> >>>>>>>>>>>> 3. Simplify the module design by mapping modules purely by
> name
> >>>>>>>>>>>>
> >>>>>>>>>>>> LOAD MODULE geo_utils
> >>>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> >>>>>>>>>>> 'type='/'module='
> >>>>>>>>>>>> but allow only 1 module to be loaded parameterized
> >>>>>>>>>>>> UNLOAD hive
> >>>>>>>>>>>> USE MODULES hive, core
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Please find more details in the reference link. Looking
> forward
> >> to
> >>>>>>>> your
> >>>>>>>>>>>> feedback.
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> >>>>>>>>>>>> <
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>>>>>>>>>
> >>>>>>>>>>>> [2]
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Jane
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Best regards!
> >>>>>>>>>> Rui Li
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jark Wu-2
Hi Jane,

Yes. I think we should fail fast.

Best,
Jark

On Wed, 3 Feb 2021 at 12:06, Jane Chan <[hidden email]> wrote:

> Hi everyone,
>
> Thanks for the discussion to make this improvement plan clearer.
>
> Hi, @Jark, @Rui, and @Timo, I'm collecting the final discussion summaries
> now and want to confirm one thing that for the statement `USE MODULES x [,
> y, z, ...]`, if the module name list contains an unexsited module, shall we
> #1 fail the execution for all of them or #2 enabled the rest modules and
> return a warning to users? My personal preference goes to #1 for
> simplicity. What do you think?
>
> Best,
> Jane
>
> On Tue, Feb 2, 2021 at 3:53 PM Timo Walther <[hidden email]> wrote:
>
> > +1
> >
> > @Jane Can you summarize our discussion in the JIRA issue?
> >
> > Thanks,
> > Timo
> >
> >
> > On 02.02.21 03:50, Jark Wu wrote:
> > > Hi Timo,
> > >
> > >> Another question is whether a LOAD operation also adds the module to
> the
> > > enabled list by default?
> > >
> > > I would like to add the module to the enabled list by default, the main
> > > reasons are:
> > > 1) Reordering is an advanced requirement, adding modules needs
> additional
> > > USE statements with "core" module
> > >   sounds too burdensome. Most users should be satisfied with only LOAD
> > > statements.
> > > 2) We should keep compatible for TableEnvironment#loadModule().
> > > 3) We are using the LOAD statement instead of CREATE, so I think it's
> > fine
> > > that it does some implicit things.
> > >
> > > Best,
> > > Jark
> > >
> > > On Tue, 2 Feb 2021 at 00:48, Timo Walther <[hidden email]> wrote:
> > >
> > >> Not the module itself but the ModuleManager should handle this case,
> > yes.
> > >>
> > >> Regards,
> > >> Timo
> > >>
> > >>
> > >> On 01.02.21 17:35, Jane Chan wrote:
> > >>> +1 to Jark's proposal
> > >>>
> > >>>    To make it clearer,  will `module#getFunctionDefinition()` return
> > empty
> > >>> suppose the module is loaded but not enabled?
> > >>>
> > >>> Best,
> > >>> Jane
> > >>>
> > >>> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]>
> > wrote:
> > >>>
> > >>>> +1 to Jark's proposal
> > >>>>
> > >>>> I like the difference between just loading and actually enabling
> these
> > >>>> modules.
> > >>>>
> > >>>> @Rui: I would use the same behavior as catalogs here. You cannot
> > `USE` a
> > >>>> catalog without creating it before.
> > >>>>
> > >>>> Another question is whether a LOAD operation also adds the module to
> > the
> > >>>> enabled list by default?
> > >>>>
> > >>>> Regards,
> > >>>> Timo
> > >>>>
> > >>>> On 01.02.21 13:52, Rui Li wrote:
> > >>>>> If `USE MODULES` implies unloading modules that are not listed,
> does
> > it
> > >>>>> also imply loading modules that are not previously loaded,
> especially
> > >>>> since
> > >>>>> we're mapping modules by name now?
> > >>>>>
> > >>>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
> > >>>>>
> > >>>>>> I agree with Timo that the USE implies the specified modules are
> in
> > >> use
> > >>>> in
> > >>>>>> the specified order and others are not used.
> > >>>>>> This would be easier to know what's the result list and order
> after
> > >> the
> > >>>> USE
> > >>>>>> statement.
> > >>>>>> That means: if current modules in order are x, y, z. And `USE
> > MODULES
> > >>>> z, y`
> > >>>>>> means current modules in order are z, y.
> > >>>>>>
> > >>>>>> But I would like to not unload the unmentioned modules in the USE
> > >>>>>> statement. Because it seems strange that USE
> > >>>>>> will implicitly remove modules. In the above example, the user may
> > >> type
> > >>>> the
> > >>>>>> wrong modules list using USE by mistake
> > >>>>>>     and would like to declare the list again, the user has to
> create
> > >> the
> > >>>>>> module again with some properties he may don't know. Therefore, I
> > >>>> propose
> > >>>>>> the USE statement just specifies the current module lists and
> > doesn't
> > >>>>>> unload modules.
> > >>>>>> Besides that, we may need a new syntax to list all the modules
> > >> including
> > >>>>>> not used but loaded.
> > >>>>>> We can introduce SHOW FULL MODULES for this purpose with an
> > additional
> > >>>>>> `used` column.
> > >>>>>>
> > >>>>>> For example:
> > >>>>>>
> > >>>>>> Flink SQL> list modules:
> > >>>>>> -----------
> > >>>>>> | modules |
> > >>>>>> -----------
> > >>>>>> | x       |
> > >>>>>> | y       |
> > >>>>>> | z       |
> > >>>>>> -----------
> > >>>>>> Flink SQL> USE MODULES z, y;
> > >>>>>> Flink SQL> show modules:
> > >>>>>> -----------
> > >>>>>> | modules |
> > >>>>>> -----------
> > >>>>>> | z       |
> > >>>>>> | y       |
> > >>>>>> -----------
> > >>>>>> Flink SQL> show FULL modules;
> > >>>>>> -------------------
> > >>>>>> | modules |  used |
> > >>>>>> -------------------
> > >>>>>> | z       | true  |
> > >>>>>> | y       | true  |
> > >>>>>> | x       | false |
> > >>>>>> -------------------
> > >>>>>> Flink SQL> USE MODULES z, y, x;
> > >>>>>> Flink SQL> show modules;
> > >>>>>> -----------
> > >>>>>> | modules |
> > >>>>>> -----------
> > >>>>>> | z       |
> > >>>>>> | y       |
> > >>>>>> | x       |
> > >>>>>> -----------
> > >>>>>>
> > >>>>>> What do you think?
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Jark
> > >>>>>>
> > >>>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]>
> > wrote:
> > >>>>>>
> > >>>>>>> Hi Timo, thanks for the discussion.
> > >>>>>>>
> > >>>>>>> It seems to reach an agreement regarding #3 that <1> Module name
> > >> should
> > >>>>>>> better be a simple identifier rather than a string literal. <2>
> > >>>> Property
> > >>>>>>> `type` is redundant and should be removed, and mapping will rely
> on
> > >> the
> > >>>>>>> module name because loading a module multiple times just using a
> > >>>>>> different
> > >>>>>>> module name doesn't make much sense. <3> We should migrate to the
> > >> newer
> > >>>>>> API
> > >>>>>>> rather than the deprecated `TableFactory` class.
> > >>>>>>>
> > >>>>>>> Regarding #1, I think the point lies in whether changing the
> > >> resolution
> > >>>>>>> order implies an `unload` operation explicitly (i.e., users could
> > >> sense
> > >>>>>>> it). What do others think?
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Jane
> > >>>>>>>
> > >>>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <[hidden email]>
> > >>>> wrote:
> > >>>>>>>
> > >>>>>>>> IMHO I would rather unload the not mentioned modules. The
> > statement
> > >>>>>>>> expresses `USE` that implicilty implies that the other modules
> are
> > >>>> "not
> > >>>>>>>> used". What do others think?
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Timo
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On 01.02.21 11:28, Jane Chan wrote:
> > >>>>>>>>> Hi Jark and Rui,
> > >>>>>>>>>
> > >>>>>>>>> Thanks for the discussions.
> > >>>>>>>>>
> > >>>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
> > >>>>>>>>>
> > >>>>>>>>>> It can be interpreted as "setting the current order of
> modules",
> > >>>>>> which
> > >>>>>>>> is
> > >>>>>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
> > >>>>>>>>>>
> > >>>>>>>>> I would like to confirm that the unmentioned modules remain in
> > the
> > >>>>>> same
> > >>>>>>>>> relative order? E.g., if there are three loaded modules `X`,
> `Y`,
> > >>>>>> `Z`,
> > >>>>>>>> then
> > >>>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> > >>>>>>>>>
> > >>>>>>>>> Regarding #3, I'm fine with mapping modules purely by name,
> and I
> > >>>>>> think
> > >>>>>>>>> Jark raised a good point on making the module name a simple
> > >>>>>> identifier
> > >>>>>>>>> instead of a string literal. For backward compatibility, since
> we
> > >>>>>>> haven't
> > >>>>>>>>> supported this syntax yet, the affected users are those who
> > defined
> > >>>>>>>> modules
> > >>>>>>>>> in the YAML configuration file. Maybe we can eliminate the
> 'type'
> > >>>>>> from
> > >>>>>>>> the
> > >>>>>>>>> 'requiredContext' to make it optional. Thus the proposed
> mapping
> > >>>>>>>> mechanism
> > >>>>>>>>> could use the module name to lookup the suitable factory,  and
> in
> > >> the
> > >>>>>>>>> meanwhile updating documentation to encourage users to simplify
> > >> their
> > >>>>>>>> YAML
> > >>>>>>>>> configuration. And in the long run, we can deprecate the
> 'type'.
> > >>>>>>>>>
> > >>>>>>>>> Best,
> > >>>>>>>>> Jane
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]>
> > >> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Thanks Jane for starting the discussion.
> > >>>>>>>>>>
> > >>>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
> > >>>>>>> interpreted
> > >>>>>>>> as
> > >>>>>>>>>> "setting the current order of modules", which is similar to
> > >> "setting
> > >>>>>>> the
> > >>>>>>>>>> current catalog" for `USE CATALOG`.
> > >>>>>>>>>>
> > >>>>>>>>>> Regarding #3, I'm fine to map modules purely by name because I
> > >> think
> > >>>>>>> it
> > >>>>>>>>>> satisfies all the use cases we have at hand. But I guess we
> need
> > >> to
> > >>>>>>> make
> > >>>>>>>>>> sure we're backward compatible, i.e. users don't need to
> change
> > >>>>>> their
> > >>>>>>>> yaml
> > >>>>>>>>>> files to configure the modules.
> > >>>>>>>>>>
> > >>>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]>
> > wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Thanks Jane for the summary and starting the discussion in
> the
> > >>>>>>> mailing
> > >>>>>>>>>>> list.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Here are my thoughts:
> > >>>>>>>>>>>
> > >>>>>>>>>>> 1) syntax to reorder modules
> > >>>>>>>>>>> I agree with Rui Li it would be quite useful if we can have
> > some
> > >>>>>>> syntax
> > >>>>>>>>>> to
> > >>>>>>>>>>> reorder modules.
> > >>>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD MODULES
> x,
> > >> y,
> > >>>>>>> z`,
> > >>>>>>>>>>> because USE has a more sense of effective and specifying
> > >> ordering,
> > >>>>>>> than
> > >>>>>>>>>>> RELOAD.
> > >>>>>>>>>>>     From my feeling, RELOAD just means we unregister and
> > register
> > >>>>>> x,y,z
> > >>>>>>>>>> modules
> > >>>>>>>>>>> again,
> > >>>>>>>>>>> it sounds like other registered modules are still in use and
> in
> > >> the
> > >>>>>>>>>> order.
> > >>>>>>>>>>>
> > >>>>>>>>>>> 3) mapping modules purely by name
> > >>>>>>>>>>> This can definitely improve the usability of loading modules,
> > >>>>>> because
> > >>>>>>>>>>> the 'type=' property
> > >>>>>>>>>>> looks really redundant. We can think of this as a syntax
> sugar
> > >> that
> > >>>>>>> the
> > >>>>>>>>>>> default type value is the module name.
> > >>>>>>>>>>> And we can support to specify 'type=' property in the future
> to
> > >>>>>> allow
> > >>>>>>>>>>> multiple modules for one module type.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Besides, I would like to mention one more change, that the
> > module
> > >>>>>>> name
> > >>>>>>>>>>> proposed in FLIP-68 is a string literal.
> > >>>>>>>>>>> But I think we are all on the same page to change it into a
> > >> simple
> > >>>>>>>>>>> (non-compound) identifier.
> > >>>>>>>>>>>
> > >>>>>>>>>>> LOAD/UNLOAD MODULE 'core'
> > >>>>>>>>>>> ==>
> > >>>>>>>>>>> LOAD/UNLOAD MODULE core
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>> Jark
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <
> [hidden email]
> > >
> > >>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1] about
> > >>>>>>> supporting
> > >>>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first
> > >> proposed
> > >>>>>> by
> > >>>>>>>>>>>> FLIP-68 [2] as following.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> -- load a module with the given name and append it to the
> end
> > of
> > >>>>>> the
> > >>>>>>>>>>> module
> > >>>>>>>>>>>> list
> > >>>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp',
> ...)]
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> --unload a module by name from the module list and other
> > modules
> > >>>>>>>> remain
> > >>>>>>>>>>> in
> > >>>>>>>>>>>> the same relative positions
> > >>>>>>>>>>>> UNLOAD MODULE 'name'
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> After a round of discussion on the Jira ticket, it seems
> some
> > >>>>>>>>>> unanswered
> > >>>>>>>>>>>> questions need more opinions and suggestions.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> 1. The way to redefine resolution order easily
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>         Rui Li suggested introducing `USE MODULES` and
> adding
> > >>>> similar
> > >>>>>>>>>>>> functionality to the API because
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>>      1) It's very tedious to unload old modules just to
> > reorder
> > >>>>>> them.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>      2) Users may not even know how to "re-load" an old
> module
> > >> if it
> > >>>>>>> was
> > >>>>>>>>>> not
> > >>>>>>>>>>>>> initially loaded by the user, e.g. don't know which type to
> > >> use.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>         Jane Chan wondered that module is not like the
> catalog
> > >> which
> > >>>>>>> has
> > >>>>>>>> a
> > >>>>>>>>>>>> concept of namespace could specify, and `USE` sounds like a
> > >>>>>>>>>>>> mutual-exclusive concept.
> > >>>>>>>>>>>>         Maybe `RELOAD MODULES` can express upgrading the
> > >> priority of
> > >>>>>>> the
> > >>>>>>>>>>> loaded
> > >>>>>>>>>>>> module(s).
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> > >>>>>>>>>>>>         Jark Wu and Nicholas Jiang proposed to use
> > `CREATE/DROP
> > >>>>>> MODULE`
> > >>>>>>>>>>> instead
> > >>>>>>>>>>>> of `LOAD/UNLOAD MODULE` because
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>>      1) From a pure SQL user's perspective, maybe `CREATE
> > >> MODULE +
> > >>>>>> USE
> > >>>>>>>>>>>> MODULE`
> > >>>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
> > >>>>>>>>>>>>>      2) This will be very similar to what the catalog used
> > now.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>       Timo Walther would rather stick to the agreed design
> > >> because
> > >>>>>>>>>>>> loading/unloading modules is a concept known from kernels
> etc.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> 3. Simplify the module design by mapping modules purely by
> > name
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> LOAD MODULE geo_utils
> > >>>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> > >>>>>>>>>>> 'type='/'module='
> > >>>>>>>>>>>> but allow only 1 module to be loaded parameterized
> > >>>>>>>>>>>> UNLOAD hive
> > >>>>>>>>>>>> USE MODULES hive, core
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Please find more details in the reference link. Looking
> > forward
> > >> to
> > >>>>>>>> your
> > >>>>>>>>>>>> feedback.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> > >>>>>>>>>>>> <
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>> [2]
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>> Jane
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> --
> > >>>>>>>>>> Best regards!
> > >>>>>>>>>> Rui Li
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jane Chan
Hi everyone,

I did a summary on the Jira issue page [1] since the discussion has
achieved a consensus. If there is anything missed or not corrected, please
let me know.

[1] https://issues.apache.org/jira/browse/FLINK-21045#

Best,
Jane





On Wed, Feb 3, 2021 at 1:33 PM Jark Wu <[hidden email]> wrote:

> Hi Jane,
>
> Yes. I think we should fail fast.
>
> Best,
> Jark
>
> On Wed, 3 Feb 2021 at 12:06, Jane Chan <[hidden email]> wrote:
>
> > Hi everyone,
> >
> > Thanks for the discussion to make this improvement plan clearer.
> >
> > Hi, @Jark, @Rui, and @Timo, I'm collecting the final discussion summaries
> > now and want to confirm one thing that for the statement `USE MODULES x
> [,
> > y, z, ...]`, if the module name list contains an unexsited module, shall
> we
> > #1 fail the execution for all of them or #2 enabled the rest modules and
> > return a warning to users? My personal preference goes to #1 for
> > simplicity. What do you think?
> >
> > Best,
> > Jane
> >
> > On Tue, Feb 2, 2021 at 3:53 PM Timo Walther <[hidden email]> wrote:
> >
> > > +1
> > >
> > > @Jane Can you summarize our discussion in the JIRA issue?
> > >
> > > Thanks,
> > > Timo
> > >
> > >
> > > On 02.02.21 03:50, Jark Wu wrote:
> > > > Hi Timo,
> > > >
> > > >> Another question is whether a LOAD operation also adds the module to
> > the
> > > > enabled list by default?
> > > >
> > > > I would like to add the module to the enabled list by default, the
> main
> > > > reasons are:
> > > > 1) Reordering is an advanced requirement, adding modules needs
> > additional
> > > > USE statements with "core" module
> > > >   sounds too burdensome. Most users should be satisfied with only
> LOAD
> > > > statements.
> > > > 2) We should keep compatible for TableEnvironment#loadModule().
> > > > 3) We are using the LOAD statement instead of CREATE, so I think it's
> > > fine
> > > > that it does some implicit things.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > On Tue, 2 Feb 2021 at 00:48, Timo Walther <[hidden email]>
> wrote:
> > > >
> > > >> Not the module itself but the ModuleManager should handle this case,
> > > yes.
> > > >>
> > > >> Regards,
> > > >> Timo
> > > >>
> > > >>
> > > >> On 01.02.21 17:35, Jane Chan wrote:
> > > >>> +1 to Jark's proposal
> > > >>>
> > > >>>    To make it clearer,  will `module#getFunctionDefinition()`
> return
> > > empty
> > > >>> suppose the module is loaded but not enabled?
> > > >>>
> > > >>> Best,
> > > >>> Jane
> > > >>>
> > > >>> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]>
> > > wrote:
> > > >>>
> > > >>>> +1 to Jark's proposal
> > > >>>>
> > > >>>> I like the difference between just loading and actually enabling
> > these
> > > >>>> modules.
> > > >>>>
> > > >>>> @Rui: I would use the same behavior as catalogs here. You cannot
> > > `USE` a
> > > >>>> catalog without creating it before.
> > > >>>>
> > > >>>> Another question is whether a LOAD operation also adds the module
> to
> > > the
> > > >>>> enabled list by default?
> > > >>>>
> > > >>>> Regards,
> > > >>>> Timo
> > > >>>>
> > > >>>> On 01.02.21 13:52, Rui Li wrote:
> > > >>>>> If `USE MODULES` implies unloading modules that are not listed,
> > does
> > > it
> > > >>>>> also imply loading modules that are not previously loaded,
> > especially
> > > >>>> since
> > > >>>>> we're mapping modules by name now?
> > > >>>>>
> > > >>>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
> > > >>>>>
> > > >>>>>> I agree with Timo that the USE implies the specified modules are
> > in
> > > >> use
> > > >>>> in
> > > >>>>>> the specified order and others are not used.
> > > >>>>>> This would be easier to know what's the result list and order
> > after
> > > >> the
> > > >>>> USE
> > > >>>>>> statement.
> > > >>>>>> That means: if current modules in order are x, y, z. And `USE
> > > MODULES
> > > >>>> z, y`
> > > >>>>>> means current modules in order are z, y.
> > > >>>>>>
> > > >>>>>> But I would like to not unload the unmentioned modules in the
> USE
> > > >>>>>> statement. Because it seems strange that USE
> > > >>>>>> will implicitly remove modules. In the above example, the user
> may
> > > >> type
> > > >>>> the
> > > >>>>>> wrong modules list using USE by mistake
> > > >>>>>>     and would like to declare the list again, the user has to
> > create
> > > >> the
> > > >>>>>> module again with some properties he may don't know. Therefore,
> I
> > > >>>> propose
> > > >>>>>> the USE statement just specifies the current module lists and
> > > doesn't
> > > >>>>>> unload modules.
> > > >>>>>> Besides that, we may need a new syntax to list all the modules
> > > >> including
> > > >>>>>> not used but loaded.
> > > >>>>>> We can introduce SHOW FULL MODULES for this purpose with an
> > > additional
> > > >>>>>> `used` column.
> > > >>>>>>
> > > >>>>>> For example:
> > > >>>>>>
> > > >>>>>> Flink SQL> list modules:
> > > >>>>>> -----------
> > > >>>>>> | modules |
> > > >>>>>> -----------
> > > >>>>>> | x       |
> > > >>>>>> | y       |
> > > >>>>>> | z       |
> > > >>>>>> -----------
> > > >>>>>> Flink SQL> USE MODULES z, y;
> > > >>>>>> Flink SQL> show modules:
> > > >>>>>> -----------
> > > >>>>>> | modules |
> > > >>>>>> -----------
> > > >>>>>> | z       |
> > > >>>>>> | y       |
> > > >>>>>> -----------
> > > >>>>>> Flink SQL> show FULL modules;
> > > >>>>>> -------------------
> > > >>>>>> | modules |  used |
> > > >>>>>> -------------------
> > > >>>>>> | z       | true  |
> > > >>>>>> | y       | true  |
> > > >>>>>> | x       | false |
> > > >>>>>> -------------------
> > > >>>>>> Flink SQL> USE MODULES z, y, x;
> > > >>>>>> Flink SQL> show modules;
> > > >>>>>> -----------
> > > >>>>>> | modules |
> > > >>>>>> -----------
> > > >>>>>> | z       |
> > > >>>>>> | y       |
> > > >>>>>> | x       |
> > > >>>>>> -----------
> > > >>>>>>
> > > >>>>>> What do you think?
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Jark
> > > >>>>>>
> > > >>>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]>
> > > wrote:
> > > >>>>>>
> > > >>>>>>> Hi Timo, thanks for the discussion.
> > > >>>>>>>
> > > >>>>>>> It seems to reach an agreement regarding #3 that <1> Module
> name
> > > >> should
> > > >>>>>>> better be a simple identifier rather than a string literal. <2>
> > > >>>> Property
> > > >>>>>>> `type` is redundant and should be removed, and mapping will
> rely
> > on
> > > >> the
> > > >>>>>>> module name because loading a module multiple times just using
> a
> > > >>>>>> different
> > > >>>>>>> module name doesn't make much sense. <3> We should migrate to
> the
> > > >> newer
> > > >>>>>> API
> > > >>>>>>> rather than the deprecated `TableFactory` class.
> > > >>>>>>>
> > > >>>>>>> Regarding #1, I think the point lies in whether changing the
> > > >> resolution
> > > >>>>>>> order implies an `unload` operation explicitly (i.e., users
> could
> > > >> sense
> > > >>>>>>> it). What do others think?
> > > >>>>>>>
> > > >>>>>>> Best,
> > > >>>>>>> Jane
> > > >>>>>>>
> > > >>>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <
> [hidden email]>
> > > >>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> IMHO I would rather unload the not mentioned modules. The
> > > statement
> > > >>>>>>>> expresses `USE` that implicilty implies that the other modules
> > are
> > > >>>> "not
> > > >>>>>>>> used". What do others think?
> > > >>>>>>>>
> > > >>>>>>>> Regards,
> > > >>>>>>>> Timo
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On 01.02.21 11:28, Jane Chan wrote:
> > > >>>>>>>>> Hi Jark and Rui,
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks for the discussions.
> > > >>>>>>>>>
> > > >>>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
> > > >>>>>>>>>
> > > >>>>>>>>>> It can be interpreted as "setting the current order of
> > modules",
> > > >>>>>> which
> > > >>>>>>>> is
> > > >>>>>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
> > > >>>>>>>>>>
> > > >>>>>>>>> I would like to confirm that the unmentioned modules remain
> in
> > > the
> > > >>>>>> same
> > > >>>>>>>>> relative order? E.g., if there are three loaded modules `X`,
> > `Y`,
> > > >>>>>> `Z`,
> > > >>>>>>>> then
> > > >>>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> > > >>>>>>>>>
> > > >>>>>>>>> Regarding #3, I'm fine with mapping modules purely by name,
> > and I
> > > >>>>>> think
> > > >>>>>>>>> Jark raised a good point on making the module name a simple
> > > >>>>>> identifier
> > > >>>>>>>>> instead of a string literal. For backward compatibility,
> since
> > we
> > > >>>>>>> haven't
> > > >>>>>>>>> supported this syntax yet, the affected users are those who
> > > defined
> > > >>>>>>>> modules
> > > >>>>>>>>> in the YAML configuration file. Maybe we can eliminate the
> > 'type'
> > > >>>>>> from
> > > >>>>>>>> the
> > > >>>>>>>>> 'requiredContext' to make it optional. Thus the proposed
> > mapping
> > > >>>>>>>> mechanism
> > > >>>>>>>>> could use the module name to lookup the suitable factory,
> and
> > in
> > > >> the
> > > >>>>>>>>> meanwhile updating documentation to encourage users to
> simplify
> > > >> their
> > > >>>>>>>> YAML
> > > >>>>>>>>> configuration. And in the long run, we can deprecate the
> > 'type'.
> > > >>>>>>>>>
> > > >>>>>>>>> Best,
> > > >>>>>>>>> Jane
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]
> >
> > > >> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Thanks Jane for starting the discussion.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
> > > >>>>>>> interpreted
> > > >>>>>>>> as
> > > >>>>>>>>>> "setting the current order of modules", which is similar to
> > > >> "setting
> > > >>>>>>> the
> > > >>>>>>>>>> current catalog" for `USE CATALOG`.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Regarding #3, I'm fine to map modules purely by name
> because I
> > > >> think
> > > >>>>>>> it
> > > >>>>>>>>>> satisfies all the use cases we have at hand. But I guess we
> > need
> > > >> to
> > > >>>>>>> make
> > > >>>>>>>>>> sure we're backward compatible, i.e. users don't need to
> > change
> > > >>>>>> their
> > > >>>>>>>> yaml
> > > >>>>>>>>>> files to configure the modules.
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]>
> > > wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Thanks Jane for the summary and starting the discussion in
> > the
> > > >>>>>>> mailing
> > > >>>>>>>>>>> list.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Here are my thoughts:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> 1) syntax to reorder modules
> > > >>>>>>>>>>> I agree with Rui Li it would be quite useful if we can have
> > > some
> > > >>>>>>> syntax
> > > >>>>>>>>>> to
> > > >>>>>>>>>>> reorder modules.
> > > >>>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD
> MODULES
> > x,
> > > >> y,
> > > >>>>>>> z`,
> > > >>>>>>>>>>> because USE has a more sense of effective and specifying
> > > >> ordering,
> > > >>>>>>> than
> > > >>>>>>>>>>> RELOAD.
> > > >>>>>>>>>>>     From my feeling, RELOAD just means we unregister and
> > > register
> > > >>>>>> x,y,z
> > > >>>>>>>>>> modules
> > > >>>>>>>>>>> again,
> > > >>>>>>>>>>> it sounds like other registered modules are still in use
> and
> > in
> > > >> the
> > > >>>>>>>>>> order.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> 3) mapping modules purely by name
> > > >>>>>>>>>>> This can definitely improve the usability of loading
> modules,
> > > >>>>>> because
> > > >>>>>>>>>>> the 'type=' property
> > > >>>>>>>>>>> looks really redundant. We can think of this as a syntax
> > sugar
> > > >> that
> > > >>>>>>> the
> > > >>>>>>>>>>> default type value is the module name.
> > > >>>>>>>>>>> And we can support to specify 'type=' property in the
> future
> > to
> > > >>>>>> allow
> > > >>>>>>>>>>> multiple modules for one module type.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Besides, I would like to mention one more change, that the
> > > module
> > > >>>>>>> name
> > > >>>>>>>>>>> proposed in FLIP-68 is a string literal.
> > > >>>>>>>>>>> But I think we are all on the same page to change it into a
> > > >> simple
> > > >>>>>>>>>>> (non-compound) identifier.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> LOAD/UNLOAD MODULE 'core'
> > > >>>>>>>>>>> ==>
> > > >>>>>>>>>>> LOAD/UNLOAD MODULE core
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Best,
> > > >>>>>>>>>>> Jark
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <
> > [hidden email]
> > > >
> > > >>>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1]
> about
> > > >>>>>>> supporting
> > > >>>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first
> > > >> proposed
> > > >>>>>> by
> > > >>>>>>>>>>>> FLIP-68 [2] as following.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> -- load a module with the given name and append it to the
> > end
> > > of
> > > >>>>>> the
> > > >>>>>>>>>>> module
> > > >>>>>>>>>>>> list
> > > >>>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp',
> > ...)]
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> --unload a module by name from the module list and other
> > > modules
> > > >>>>>>>> remain
> > > >>>>>>>>>>> in
> > > >>>>>>>>>>>> the same relative positions
> > > >>>>>>>>>>>> UNLOAD MODULE 'name'
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> After a round of discussion on the Jira ticket, it seems
> > some
> > > >>>>>>>>>> unanswered
> > > >>>>>>>>>>>> questions need more opinions and suggestions.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 1. The way to redefine resolution order easily
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>         Rui Li suggested introducing `USE MODULES` and
> > adding
> > > >>>> similar
> > > >>>>>>>>>>>> functionality to the API because
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>>      1) It's very tedious to unload old modules just to
> > > reorder
> > > >>>>>> them.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>      2) Users may not even know how to "re-load" an old
> > module
> > > >> if it
> > > >>>>>>> was
> > > >>>>>>>>>> not
> > > >>>>>>>>>>>>> initially loaded by the user, e.g. don't know which type
> to
> > > >> use.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>         Jane Chan wondered that module is not like the
> > catalog
> > > >> which
> > > >>>>>>> has
> > > >>>>>>>> a
> > > >>>>>>>>>>>> concept of namespace could specify, and `USE` sounds like
> a
> > > >>>>>>>>>>>> mutual-exclusive concept.
> > > >>>>>>>>>>>>         Maybe `RELOAD MODULES` can express upgrading the
> > > >> priority of
> > > >>>>>>> the
> > > >>>>>>>>>>> loaded
> > > >>>>>>>>>>>> module(s).
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> > > >>>>>>>>>>>>         Jark Wu and Nicholas Jiang proposed to use
> > > `CREATE/DROP
> > > >>>>>> MODULE`
> > > >>>>>>>>>>> instead
> > > >>>>>>>>>>>> of `LOAD/UNLOAD MODULE` because
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>>      1) From a pure SQL user's perspective, maybe `CREATE
> > > >> MODULE +
> > > >>>>>> USE
> > > >>>>>>>>>>>> MODULE`
> > > >>>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
> > > >>>>>>>>>>>>>      2) This will be very similar to what the catalog
> used
> > > now.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>       Timo Walther would rather stick to the agreed design
> > > >> because
> > > >>>>>>>>>>>> loading/unloading modules is a concept known from kernels
> > etc.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> 3. Simplify the module design by mapping modules purely by
> > > name
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> LOAD MODULE geo_utils
> > > >>>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> > > >>>>>>>>>>> 'type='/'module='
> > > >>>>>>>>>>>> but allow only 1 module to be loaded parameterized
> > > >>>>>>>>>>>> UNLOAD hive
> > > >>>>>>>>>>>> USE MODULES hive, core
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Please find more details in the reference link. Looking
> > > forward
> > > >> to
> > > >>>>>>>> your
> > > >>>>>>>>>>>> feedback.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> > > >>>>>>>>>>>> <
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>> [2]
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best,
> > > >>>>>>>>>>>> Jane
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> --
> > > >>>>>>>>>> Best regards!
> > > >>>>>>>>>> Rui Li
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>
> > > >>>>
> > > >>>
> > > >>
> > > >>
> > > >
> > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Timo Walther-2
Thanks for the nice summary Jane. The summary looks great. Some minor
feedback:

- Remove the `used` column for SHOW MODULES. It will always show true.

- `List<Pair<String, Boolean>> listFullModules()` is a very long
signature. And `Pair` should be avoided in code because it is not very
descriptive. How about creating a POJO (static inner class of
ModuleManager) called `ModuleEntry` or similar.

Otherwise +1 for the proposal.

Regards,
Timo

On 03.02.21 11:24, Jane Chan wrote:

> Hi everyone,
>
> I did a summary on the Jira issue page [1] since the discussion has
> achieved a consensus. If there is anything missed or not corrected, please
> let me know.
>
> [1] https://issues.apache.org/jira/browse/FLINK-21045#
>
> Best,
> Jane
>
>
>
>
>
> On Wed, Feb 3, 2021 at 1:33 PM Jark Wu <[hidden email]> wrote:
>
>> Hi Jane,
>>
>> Yes. I think we should fail fast.
>>
>> Best,
>> Jark
>>
>> On Wed, 3 Feb 2021 at 12:06, Jane Chan <[hidden email]> wrote:
>>
>>> Hi everyone,
>>>
>>> Thanks for the discussion to make this improvement plan clearer.
>>>
>>> Hi, @Jark, @Rui, and @Timo, I'm collecting the final discussion summaries
>>> now and want to confirm one thing that for the statement `USE MODULES x
>> [,
>>> y, z, ...]`, if the module name list contains an unexsited module, shall
>> we
>>> #1 fail the execution for all of them or #2 enabled the rest modules and
>>> return a warning to users? My personal preference goes to #1 for
>>> simplicity. What do you think?
>>>
>>> Best,
>>> Jane
>>>
>>> On Tue, Feb 2, 2021 at 3:53 PM Timo Walther <[hidden email]> wrote:
>>>
>>>> +1
>>>>
>>>> @Jane Can you summarize our discussion in the JIRA issue?
>>>>
>>>> Thanks,
>>>> Timo
>>>>
>>>>
>>>> On 02.02.21 03:50, Jark Wu wrote:
>>>>> Hi Timo,
>>>>>
>>>>>> Another question is whether a LOAD operation also adds the module to
>>> the
>>>>> enabled list by default?
>>>>>
>>>>> I would like to add the module to the enabled list by default, the
>> main
>>>>> reasons are:
>>>>> 1) Reordering is an advanced requirement, adding modules needs
>>> additional
>>>>> USE statements with "core" module
>>>>>    sounds too burdensome. Most users should be satisfied with only
>> LOAD
>>>>> statements.
>>>>> 2) We should keep compatible for TableEnvironment#loadModule().
>>>>> 3) We are using the LOAD statement instead of CREATE, so I think it's
>>>> fine
>>>>> that it does some implicit things.
>>>>>
>>>>> Best,
>>>>> Jark
>>>>>
>>>>> On Tue, 2 Feb 2021 at 00:48, Timo Walther <[hidden email]>
>> wrote:
>>>>>
>>>>>> Not the module itself but the ModuleManager should handle this case,
>>>> yes.
>>>>>>
>>>>>> Regards,
>>>>>> Timo
>>>>>>
>>>>>>
>>>>>> On 01.02.21 17:35, Jane Chan wrote:
>>>>>>> +1 to Jark's proposal
>>>>>>>
>>>>>>>     To make it clearer,  will `module#getFunctionDefinition()`
>> return
>>>> empty
>>>>>>> suppose the module is loaded but not enabled?
>>>>>>>
>>>>>>> Best,
>>>>>>> Jane
>>>>>>>
>>>>>>> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]>
>>>> wrote:
>>>>>>>
>>>>>>>> +1 to Jark's proposal
>>>>>>>>
>>>>>>>> I like the difference between just loading and actually enabling
>>> these
>>>>>>>> modules.
>>>>>>>>
>>>>>>>> @Rui: I would use the same behavior as catalogs here. You cannot
>>>> `USE` a
>>>>>>>> catalog without creating it before.
>>>>>>>>
>>>>>>>> Another question is whether a LOAD operation also adds the module
>> to
>>>> the
>>>>>>>> enabled list by default?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Timo
>>>>>>>>
>>>>>>>> On 01.02.21 13:52, Rui Li wrote:
>>>>>>>>> If `USE MODULES` implies unloading modules that are not listed,
>>> does
>>>> it
>>>>>>>>> also imply loading modules that are not previously loaded,
>>> especially
>>>>>>>> since
>>>>>>>>> we're mapping modules by name now?
>>>>>>>>>
>>>>>>>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>>> I agree with Timo that the USE implies the specified modules are
>>> in
>>>>>> use
>>>>>>>> in
>>>>>>>>>> the specified order and others are not used.
>>>>>>>>>> This would be easier to know what's the result list and order
>>> after
>>>>>> the
>>>>>>>> USE
>>>>>>>>>> statement.
>>>>>>>>>> That means: if current modules in order are x, y, z. And `USE
>>>> MODULES
>>>>>>>> z, y`
>>>>>>>>>> means current modules in order are z, y.
>>>>>>>>>>
>>>>>>>>>> But I would like to not unload the unmentioned modules in the
>> USE
>>>>>>>>>> statement. Because it seems strange that USE
>>>>>>>>>> will implicitly remove modules. In the above example, the user
>> may
>>>>>> type
>>>>>>>> the
>>>>>>>>>> wrong modules list using USE by mistake
>>>>>>>>>>      and would like to declare the list again, the user has to
>>> create
>>>>>> the
>>>>>>>>>> module again with some properties he may don't know. Therefore,
>> I
>>>>>>>> propose
>>>>>>>>>> the USE statement just specifies the current module lists and
>>>> doesn't
>>>>>>>>>> unload modules.
>>>>>>>>>> Besides that, we may need a new syntax to list all the modules
>>>>>> including
>>>>>>>>>> not used but loaded.
>>>>>>>>>> We can introduce SHOW FULL MODULES for this purpose with an
>>>> additional
>>>>>>>>>> `used` column.
>>>>>>>>>>
>>>>>>>>>> For example:
>>>>>>>>>>
>>>>>>>>>> Flink SQL> list modules:
>>>>>>>>>> -----------
>>>>>>>>>> | modules |
>>>>>>>>>> -----------
>>>>>>>>>> | x       |
>>>>>>>>>> | y       |
>>>>>>>>>> | z       |
>>>>>>>>>> -----------
>>>>>>>>>> Flink SQL> USE MODULES z, y;
>>>>>>>>>> Flink SQL> show modules:
>>>>>>>>>> -----------
>>>>>>>>>> | modules |
>>>>>>>>>> -----------
>>>>>>>>>> | z       |
>>>>>>>>>> | y       |
>>>>>>>>>> -----------
>>>>>>>>>> Flink SQL> show FULL modules;
>>>>>>>>>> -------------------
>>>>>>>>>> | modules |  used |
>>>>>>>>>> -------------------
>>>>>>>>>> | z       | true  |
>>>>>>>>>> | y       | true  |
>>>>>>>>>> | x       | false |
>>>>>>>>>> -------------------
>>>>>>>>>> Flink SQL> USE MODULES z, y, x;
>>>>>>>>>> Flink SQL> show modules;
>>>>>>>>>> -----------
>>>>>>>>>> | modules |
>>>>>>>>>> -----------
>>>>>>>>>> | z       |
>>>>>>>>>> | y       |
>>>>>>>>>> | x       |
>>>>>>>>>> -----------
>>>>>>>>>>
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Jark
>>>>>>>>>>
>>>>>>>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]>
>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Timo, thanks for the discussion.
>>>>>>>>>>>
>>>>>>>>>>> It seems to reach an agreement regarding #3 that <1> Module
>> name
>>>>>> should
>>>>>>>>>>> better be a simple identifier rather than a string literal. <2>
>>>>>>>> Property
>>>>>>>>>>> `type` is redundant and should be removed, and mapping will
>> rely
>>> on
>>>>>> the
>>>>>>>>>>> module name because loading a module multiple times just using
>> a
>>>>>>>>>> different
>>>>>>>>>>> module name doesn't make much sense. <3> We should migrate to
>> the
>>>>>> newer
>>>>>>>>>> API
>>>>>>>>>>> rather than the deprecated `TableFactory` class.
>>>>>>>>>>>
>>>>>>>>>>> Regarding #1, I think the point lies in whether changing the
>>>>>> resolution
>>>>>>>>>>> order implies an `unload` operation explicitly (i.e., users
>> could
>>>>>> sense
>>>>>>>>>>> it). What do others think?
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Jane
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <
>> [hidden email]>
>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> IMHO I would rather unload the not mentioned modules. The
>>>> statement
>>>>>>>>>>>> expresses `USE` that implicilty implies that the other modules
>>> are
>>>>>>>> "not
>>>>>>>>>>>> used". What do others think?
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Timo
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 01.02.21 11:28, Jane Chan wrote:
>>>>>>>>>>>>> Hi Jark and Rui,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the discussions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It can be interpreted as "setting the current order of
>>> modules",
>>>>>>>>>> which
>>>>>>>>>>>> is
>>>>>>>>>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I would like to confirm that the unmentioned modules remain
>> in
>>>> the
>>>>>>>>>> same
>>>>>>>>>>>>> relative order? E.g., if there are three loaded modules `X`,
>>> `Y`,
>>>>>>>>>> `Z`,
>>>>>>>>>>>> then
>>>>>>>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regarding #3, I'm fine with mapping modules purely by name,
>>> and I
>>>>>>>>>> think
>>>>>>>>>>>>> Jark raised a good point on making the module name a simple
>>>>>>>>>> identifier
>>>>>>>>>>>>> instead of a string literal. For backward compatibility,
>> since
>>> we
>>>>>>>>>>> haven't
>>>>>>>>>>>>> supported this syntax yet, the affected users are those who
>>>> defined
>>>>>>>>>>>> modules
>>>>>>>>>>>>> in the YAML configuration file. Maybe we can eliminate the
>>> 'type'
>>>>>>>>>> from
>>>>>>>>>>>> the
>>>>>>>>>>>>> 'requiredContext' to make it optional. Thus the proposed
>>> mapping
>>>>>>>>>>>> mechanism
>>>>>>>>>>>>> could use the module name to lookup the suitable factory,
>> and
>>> in
>>>>>> the
>>>>>>>>>>>>> meanwhile updating documentation to encourage users to
>> simplify
>>>>>> their
>>>>>>>>>>>> YAML
>>>>>>>>>>>>> configuration. And in the long run, we can deprecate the
>>> 'type'.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]
>>>
>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks Jane for starting the discussion.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
>>>>>>>>>>> interpreted
>>>>>>>>>>>> as
>>>>>>>>>>>>>> "setting the current order of modules", which is similar to
>>>>>> "setting
>>>>>>>>>>> the
>>>>>>>>>>>>>> current catalog" for `USE CATALOG`.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regarding #3, I'm fine to map modules purely by name
>> because I
>>>>>> think
>>>>>>>>>>> it
>>>>>>>>>>>>>> satisfies all the use cases we have at hand. But I guess we
>>> need
>>>>>> to
>>>>>>>>>>> make
>>>>>>>>>>>>>> sure we're backward compatible, i.e. users don't need to
>>> change
>>>>>>>>>> their
>>>>>>>>>>>> yaml
>>>>>>>>>>>>>> files to configure the modules.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]>
>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks Jane for the summary and starting the discussion in
>>> the
>>>>>>>>>>> mailing
>>>>>>>>>>>>>>> list.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here are my thoughts:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1) syntax to reorder modules
>>>>>>>>>>>>>>> I agree with Rui Li it would be quite useful if we can have
>>>> some
>>>>>>>>>>> syntax
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> reorder modules.
>>>>>>>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD
>> MODULES
>>> x,
>>>>>> y,
>>>>>>>>>>> z`,
>>>>>>>>>>>>>>> because USE has a more sense of effective and specifying
>>>>>> ordering,
>>>>>>>>>>> than
>>>>>>>>>>>>>>> RELOAD.
>>>>>>>>>>>>>>>      From my feeling, RELOAD just means we unregister and
>>>> register
>>>>>>>>>> x,y,z
>>>>>>>>>>>>>> modules
>>>>>>>>>>>>>>> again,
>>>>>>>>>>>>>>> it sounds like other registered modules are still in use
>> and
>>> in
>>>>>> the
>>>>>>>>>>>>>> order.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3) mapping modules purely by name
>>>>>>>>>>>>>>> This can definitely improve the usability of loading
>> modules,
>>>>>>>>>> because
>>>>>>>>>>>>>>> the 'type=' property
>>>>>>>>>>>>>>> looks really redundant. We can think of this as a syntax
>>> sugar
>>>>>> that
>>>>>>>>>>> the
>>>>>>>>>>>>>>> default type value is the module name.
>>>>>>>>>>>>>>> And we can support to specify 'type=' property in the
>> future
>>> to
>>>>>>>>>> allow
>>>>>>>>>>>>>>> multiple modules for one module type.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Besides, I would like to mention one more change, that the
>>>> module
>>>>>>>>>>> name
>>>>>>>>>>>>>>> proposed in FLIP-68 is a string literal.
>>>>>>>>>>>>>>> But I think we are all on the same page to change it into a
>>>>>> simple
>>>>>>>>>>>>>>> (non-compound) identifier.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> LOAD/UNLOAD MODULE 'core'
>>>>>>>>>>>>>>> ==>
>>>>>>>>>>>>>>> LOAD/UNLOAD MODULE core
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <
>>> [hidden email]
>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1]
>> about
>>>>>>>>>>> supporting
>>>>>>>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first
>>>>>> proposed
>>>>>>>>>> by
>>>>>>>>>>>>>>>> FLIP-68 [2] as following.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- load a module with the given name and append it to the
>>> end
>>>> of
>>>>>>>>>> the
>>>>>>>>>>>>>>> module
>>>>>>>>>>>>>>>> list
>>>>>>>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp',
>>> ...)]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> --unload a module by name from the module list and other
>>>> modules
>>>>>>>>>>>> remain
>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>> the same relative positions
>>>>>>>>>>>>>>>> UNLOAD MODULE 'name'
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> After a round of discussion on the Jira ticket, it seems
>>> some
>>>>>>>>>>>>>> unanswered
>>>>>>>>>>>>>>>> questions need more opinions and suggestions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. The way to redefine resolution order easily
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>          Rui Li suggested introducing `USE MODULES` and
>>> adding
>>>>>>>> similar
>>>>>>>>>>>>>>>> functionality to the API because
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>       1) It's very tedious to unload old modules just to
>>>> reorder
>>>>>>>>>> them.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>       2) Users may not even know how to "re-load" an old
>>> module
>>>>>> if it
>>>>>>>>>>> was
>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>> initially loaded by the user, e.g. don't know which type
>> to
>>>>>> use.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>          Jane Chan wondered that module is not like the
>>> catalog
>>>>>> which
>>>>>>>>>>> has
>>>>>>>>>>>> a
>>>>>>>>>>>>>>>> concept of namespace could specify, and `USE` sounds like
>> a
>>>>>>>>>>>>>>>> mutual-exclusive concept.
>>>>>>>>>>>>>>>>          Maybe `RELOAD MODULES` can express upgrading the
>>>>>> priority of
>>>>>>>>>>> the
>>>>>>>>>>>>>>> loaded
>>>>>>>>>>>>>>>> module(s).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
>>>>>>>>>>>>>>>>          Jark Wu and Nicholas Jiang proposed to use
>>>> `CREATE/DROP
>>>>>>>>>> MODULE`
>>>>>>>>>>>>>>> instead
>>>>>>>>>>>>>>>> of `LOAD/UNLOAD MODULE` because
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>       1) From a pure SQL user's perspective, maybe `CREATE
>>>>>> MODULE +
>>>>>>>>>> USE
>>>>>>>>>>>>>>>> MODULE`
>>>>>>>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
>>>>>>>>>>>>>>>>>       2) This will be very similar to what the catalog
>> used
>>>> now.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>        Timo Walther would rather stick to the agreed design
>>>>>> because
>>>>>>>>>>>>>>>> loading/unloading modules is a concept known from kernels
>>> etc.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3. Simplify the module design by mapping modules purely by
>>>> name
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> LOAD MODULE geo_utils
>>>>>>>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
>>>>>>>>>>>>>>> 'type='/'module='
>>>>>>>>>>>>>>>> but allow only 1 module to be loaded parameterized
>>>>>>>>>>>>>>>> UNLOAD hive
>>>>>>>>>>>>>>>> USE MODULES hive, core
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please find more details in the reference link. Looking
>>>> forward
>>>>>> to
>>>>>>>>>>>> your
>>>>>>>>>>>>>>>> feedback.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
>>>>>>>>>>>>>>>> <
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Best regards!
>>>>>>>>>>>>>> Rui Li
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jark Wu-2
A minor comment on `useModules(List<String> names)`,
would be better to use varargs here to a more fluent API: `useModules("a",
"b", "c")`.

Besides, do we also need to add these new methods (useModules,
listFullModules) to
TableEnvironment?

Best,
Jark

On Wed, 3 Feb 2021 at 18:36, Timo Walther <[hidden email]> wrote:

> Thanks for the nice summary Jane. The summary looks great. Some minor
> feedback:
>
> - Remove the `used` column for SHOW MODULES. It will always show true.
>
> - `List<Pair<String, Boolean>> listFullModules()` is a very long
> signature. And `Pair` should be avoided in code because it is not very
> descriptive. How about creating a POJO (static inner class of
> ModuleManager) called `ModuleEntry` or similar.
>
> Otherwise +1 for the proposal.
>
> Regards,
> Timo
>
> On 03.02.21 11:24, Jane Chan wrote:
> > Hi everyone,
> >
> > I did a summary on the Jira issue page [1] since the discussion has
> > achieved a consensus. If there is anything missed or not corrected,
> please
> > let me know.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-21045#
> >
> > Best,
> > Jane
> >
> >
> >
> >
> >
> > On Wed, Feb 3, 2021 at 1:33 PM Jark Wu <[hidden email]> wrote:
> >
> >> Hi Jane,
> >>
> >> Yes. I think we should fail fast.
> >>
> >> Best,
> >> Jark
> >>
> >> On Wed, 3 Feb 2021 at 12:06, Jane Chan <[hidden email]> wrote:
> >>
> >>> Hi everyone,
> >>>
> >>> Thanks for the discussion to make this improvement plan clearer.
> >>>
> >>> Hi, @Jark, @Rui, and @Timo, I'm collecting the final discussion
> summaries
> >>> now and want to confirm one thing that for the statement `USE MODULES x
> >> [,
> >>> y, z, ...]`, if the module name list contains an unexsited module,
> shall
> >> we
> >>> #1 fail the execution for all of them or #2 enabled the rest modules
> and
> >>> return a warning to users? My personal preference goes to #1 for
> >>> simplicity. What do you think?
> >>>
> >>> Best,
> >>> Jane
> >>>
> >>> On Tue, Feb 2, 2021 at 3:53 PM Timo Walther <[hidden email]>
> wrote:
> >>>
> >>>> +1
> >>>>
> >>>> @Jane Can you summarize our discussion in the JIRA issue?
> >>>>
> >>>> Thanks,
> >>>> Timo
> >>>>
> >>>>
> >>>> On 02.02.21 03:50, Jark Wu wrote:
> >>>>> Hi Timo,
> >>>>>
> >>>>>> Another question is whether a LOAD operation also adds the module to
> >>> the
> >>>>> enabled list by default?
> >>>>>
> >>>>> I would like to add the module to the enabled list by default, the
> >> main
> >>>>> reasons are:
> >>>>> 1) Reordering is an advanced requirement, adding modules needs
> >>> additional
> >>>>> USE statements with "core" module
> >>>>>    sounds too burdensome. Most users should be satisfied with only
> >> LOAD
> >>>>> statements.
> >>>>> 2) We should keep compatible for TableEnvironment#loadModule().
> >>>>> 3) We are using the LOAD statement instead of CREATE, so I think it's
> >>>> fine
> >>>>> that it does some implicit things.
> >>>>>
> >>>>> Best,
> >>>>> Jark
> >>>>>
> >>>>> On Tue, 2 Feb 2021 at 00:48, Timo Walther <[hidden email]>
> >> wrote:
> >>>>>
> >>>>>> Not the module itself but the ModuleManager should handle this case,
> >>>> yes.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Timo
> >>>>>>
> >>>>>>
> >>>>>> On 01.02.21 17:35, Jane Chan wrote:
> >>>>>>> +1 to Jark's proposal
> >>>>>>>
> >>>>>>>     To make it clearer,  will `module#getFunctionDefinition()`
> >> return
> >>>> empty
> >>>>>>> suppose the module is loaded but not enabled?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Jane
> >>>>>>>
> >>>>>>> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]>
> >>>> wrote:
> >>>>>>>
> >>>>>>>> +1 to Jark's proposal
> >>>>>>>>
> >>>>>>>> I like the difference between just loading and actually enabling
> >>> these
> >>>>>>>> modules.
> >>>>>>>>
> >>>>>>>> @Rui: I would use the same behavior as catalogs here. You cannot
> >>>> `USE` a
> >>>>>>>> catalog without creating it before.
> >>>>>>>>
> >>>>>>>> Another question is whether a LOAD operation also adds the module
> >> to
> >>>> the
> >>>>>>>> enabled list by default?
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Timo
> >>>>>>>>
> >>>>>>>> On 01.02.21 13:52, Rui Li wrote:
> >>>>>>>>> If `USE MODULES` implies unloading modules that are not listed,
> >>> does
> >>>> it
> >>>>>>>>> also imply loading modules that are not previously loaded,
> >>> especially
> >>>>>>>> since
> >>>>>>>>> we're mapping modules by name now?
> >>>>>>>>>
> >>>>>>>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]> wrote:
> >>>>>>>>>
> >>>>>>>>>> I agree with Timo that the USE implies the specified modules are
> >>> in
> >>>>>> use
> >>>>>>>> in
> >>>>>>>>>> the specified order and others are not used.
> >>>>>>>>>> This would be easier to know what's the result list and order
> >>> after
> >>>>>> the
> >>>>>>>> USE
> >>>>>>>>>> statement.
> >>>>>>>>>> That means: if current modules in order are x, y, z. And `USE
> >>>> MODULES
> >>>>>>>> z, y`
> >>>>>>>>>> means current modules in order are z, y.
> >>>>>>>>>>
> >>>>>>>>>> But I would like to not unload the unmentioned modules in the
> >> USE
> >>>>>>>>>> statement. Because it seems strange that USE
> >>>>>>>>>> will implicitly remove modules. In the above example, the user
> >> may
> >>>>>> type
> >>>>>>>> the
> >>>>>>>>>> wrong modules list using USE by mistake
> >>>>>>>>>>      and would like to declare the list again, the user has to
> >>> create
> >>>>>> the
> >>>>>>>>>> module again with some properties he may don't know. Therefore,
> >> I
> >>>>>>>> propose
> >>>>>>>>>> the USE statement just specifies the current module lists and
> >>>> doesn't
> >>>>>>>>>> unload modules.
> >>>>>>>>>> Besides that, we may need a new syntax to list all the modules
> >>>>>> including
> >>>>>>>>>> not used but loaded.
> >>>>>>>>>> We can introduce SHOW FULL MODULES for this purpose with an
> >>>> additional
> >>>>>>>>>> `used` column.
> >>>>>>>>>>
> >>>>>>>>>> For example:
> >>>>>>>>>>
> >>>>>>>>>> Flink SQL> list modules:
> >>>>>>>>>> -----------
> >>>>>>>>>> | modules |
> >>>>>>>>>> -----------
> >>>>>>>>>> | x       |
> >>>>>>>>>> | y       |
> >>>>>>>>>> | z       |
> >>>>>>>>>> -----------
> >>>>>>>>>> Flink SQL> USE MODULES z, y;
> >>>>>>>>>> Flink SQL> show modules:
> >>>>>>>>>> -----------
> >>>>>>>>>> | modules |
> >>>>>>>>>> -----------
> >>>>>>>>>> | z       |
> >>>>>>>>>> | y       |
> >>>>>>>>>> -----------
> >>>>>>>>>> Flink SQL> show FULL modules;
> >>>>>>>>>> -------------------
> >>>>>>>>>> | modules |  used |
> >>>>>>>>>> -------------------
> >>>>>>>>>> | z       | true  |
> >>>>>>>>>> | y       | true  |
> >>>>>>>>>> | x       | false |
> >>>>>>>>>> -------------------
> >>>>>>>>>> Flink SQL> USE MODULES z, y, x;
> >>>>>>>>>> Flink SQL> show modules;
> >>>>>>>>>> -----------
> >>>>>>>>>> | modules |
> >>>>>>>>>> -----------
> >>>>>>>>>> | z       |
> >>>>>>>>>> | y       |
> >>>>>>>>>> | x       |
> >>>>>>>>>> -----------
> >>>>>>>>>>
> >>>>>>>>>> What do you think?
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Jark
> >>>>>>>>>>
> >>>>>>>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]>
> >>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Timo, thanks for the discussion.
> >>>>>>>>>>>
> >>>>>>>>>>> It seems to reach an agreement regarding #3 that <1> Module
> >> name
> >>>>>> should
> >>>>>>>>>>> better be a simple identifier rather than a string literal. <2>
> >>>>>>>> Property
> >>>>>>>>>>> `type` is redundant and should be removed, and mapping will
> >> rely
> >>> on
> >>>>>> the
> >>>>>>>>>>> module name because loading a module multiple times just using
> >> a
> >>>>>>>>>> different
> >>>>>>>>>>> module name doesn't make much sense. <3> We should migrate to
> >> the
> >>>>>> newer
> >>>>>>>>>> API
> >>>>>>>>>>> rather than the deprecated `TableFactory` class.
> >>>>>>>>>>>
> >>>>>>>>>>> Regarding #1, I think the point lies in whether changing the
> >>>>>> resolution
> >>>>>>>>>>> order implies an `unload` operation explicitly (i.e., users
> >> could
> >>>>>> sense
> >>>>>>>>>>> it). What do others think?
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Jane
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <
> >> [hidden email]>
> >>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> IMHO I would rather unload the not mentioned modules. The
> >>>> statement
> >>>>>>>>>>>> expresses `USE` that implicilty implies that the other modules
> >>> are
> >>>>>>>> "not
> >>>>>>>>>>>> used". What do others think?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Timo
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 01.02.21 11:28, Jane Chan wrote:
> >>>>>>>>>>>>> Hi Jark and Rui,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for the discussions.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> It can be interpreted as "setting the current order of
> >>> modules",
> >>>>>>>>>> which
> >>>>>>>>>>>> is
> >>>>>>>>>>>>>> similar to "setting the current catalog" for `USE CATALOG`.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>> I would like to confirm that the unmentioned modules remain
> >> in
> >>>> the
> >>>>>>>>>> same
> >>>>>>>>>>>>> relative order? E.g., if there are three loaded modules `X`,
> >>> `Y`,
> >>>>>>>>>> `Z`,
> >>>>>>>>>>>> then
> >>>>>>>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`, `X`.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regarding #3, I'm fine with mapping modules purely by name,
> >>> and I
> >>>>>>>>>> think
> >>>>>>>>>>>>> Jark raised a good point on making the module name a simple
> >>>>>>>>>> identifier
> >>>>>>>>>>>>> instead of a string literal. For backward compatibility,
> >> since
> >>> we
> >>>>>>>>>>> haven't
> >>>>>>>>>>>>> supported this syntax yet, the affected users are those who
> >>>> defined
> >>>>>>>>>>>> modules
> >>>>>>>>>>>>> in the YAML configuration file. Maybe we can eliminate the
> >>> 'type'
> >>>>>>>>>> from
> >>>>>>>>>>>> the
> >>>>>>>>>>>>> 'requiredContext' to make it optional. Thus the proposed
> >>> mapping
> >>>>>>>>>>>> mechanism
> >>>>>>>>>>>>> could use the module name to lookup the suitable factory,
> >> and
> >>> in
> >>>>>> the
> >>>>>>>>>>>>> meanwhile updating documentation to encourage users to
> >> simplify
> >>>>>> their
> >>>>>>>>>>>> YAML
> >>>>>>>>>>>>> configuration. And in the long run, we can deprecate the
> >>> 'type'.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best,
> >>>>>>>>>>>>> Jane
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <[hidden email]
> >>>
> >>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks Jane for starting the discussion.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can be
> >>>>>>>>>>> interpreted
> >>>>>>>>>>>> as
> >>>>>>>>>>>>>> "setting the current order of modules", which is similar to
> >>>>>> "setting
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> current catalog" for `USE CATALOG`.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regarding #3, I'm fine to map modules purely by name
> >> because I
> >>>>>> think
> >>>>>>>>>>> it
> >>>>>>>>>>>>>> satisfies all the use cases we have at hand. But I guess we
> >>> need
> >>>>>> to
> >>>>>>>>>>> make
> >>>>>>>>>>>>>> sure we're backward compatible, i.e. users don't need to
> >>> change
> >>>>>>>>>> their
> >>>>>>>>>>>> yaml
> >>>>>>>>>>>>>> files to configure the modules.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]>
> >>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks Jane for the summary and starting the discussion in
> >>> the
> >>>>>>>>>>> mailing
> >>>>>>>>>>>>>>> list.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Here are my thoughts:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 1) syntax to reorder modules
> >>>>>>>>>>>>>>> I agree with Rui Li it would be quite useful if we can have
> >>>> some
> >>>>>>>>>>> syntax
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>> reorder modules.
> >>>>>>>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD
> >> MODULES
> >>> x,
> >>>>>> y,
> >>>>>>>>>>> z`,
> >>>>>>>>>>>>>>> because USE has a more sense of effective and specifying
> >>>>>> ordering,
> >>>>>>>>>>> than
> >>>>>>>>>>>>>>> RELOAD.
> >>>>>>>>>>>>>>>      From my feeling, RELOAD just means we unregister and
> >>>> register
> >>>>>>>>>> x,y,z
> >>>>>>>>>>>>>> modules
> >>>>>>>>>>>>>>> again,
> >>>>>>>>>>>>>>> it sounds like other registered modules are still in use
> >> and
> >>> in
> >>>>>> the
> >>>>>>>>>>>>>> order.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> 3) mapping modules purely by name
> >>>>>>>>>>>>>>> This can definitely improve the usability of loading
> >> modules,
> >>>>>>>>>> because
> >>>>>>>>>>>>>>> the 'type=' property
> >>>>>>>>>>>>>>> looks really redundant. We can think of this as a syntax
> >>> sugar
> >>>>>> that
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>> default type value is the module name.
> >>>>>>>>>>>>>>> And we can support to specify 'type=' property in the
> >> future
> >>> to
> >>>>>>>>>> allow
> >>>>>>>>>>>>>>> multiple modules for one module type.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Besides, I would like to mention one more change, that the
> >>>> module
> >>>>>>>>>>> name
> >>>>>>>>>>>>>>> proposed in FLIP-68 is a string literal.
> >>>>>>>>>>>>>>> But I think we are all on the same page to change it into a
> >>>>>> simple
> >>>>>>>>>>>>>>> (non-compound) identifier.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> LOAD/UNLOAD MODULE 'core'
> >>>>>>>>>>>>>>> ==>
> >>>>>>>>>>>>>>> LOAD/UNLOAD MODULE core
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <
> >>> [hidden email]
> >>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1]
> >> about
> >>>>>>>>>>> supporting
> >>>>>>>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first
> >>>>>> proposed
> >>>>>>>>>> by
> >>>>>>>>>>>>>>>> FLIP-68 [2] as following.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> -- load a module with the given name and append it to the
> >>> end
> >>>> of
> >>>>>>>>>> the
> >>>>>>>>>>>>>>> module
> >>>>>>>>>>>>>>>> list
> >>>>>>>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp',
> >>> ...)]
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> --unload a module by name from the module list and other
> >>>> modules
> >>>>>>>>>>>> remain
> >>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>> the same relative positions
> >>>>>>>>>>>>>>>> UNLOAD MODULE 'name'
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> After a round of discussion on the Jira ticket, it seems
> >>> some
> >>>>>>>>>>>>>> unanswered
> >>>>>>>>>>>>>>>> questions need more opinions and suggestions.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 1. The way to redefine resolution order easily
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>          Rui Li suggested introducing `USE MODULES` and
> >>> adding
> >>>>>>>> similar
> >>>>>>>>>>>>>>>> functionality to the API because
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>       1) It's very tedious to unload old modules just to
> >>>> reorder
> >>>>>>>>>> them.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>       2) Users may not even know how to "re-load" an old
> >>> module
> >>>>>> if it
> >>>>>>>>>>> was
> >>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>> initially loaded by the user, e.g. don't know which type
> >> to
> >>>>>> use.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>          Jane Chan wondered that module is not like the
> >>> catalog
> >>>>>> which
> >>>>>>>>>>> has
> >>>>>>>>>>>> a
> >>>>>>>>>>>>>>>> concept of namespace could specify, and `USE` sounds like
> >> a
> >>>>>>>>>>>>>>>> mutual-exclusive concept.
> >>>>>>>>>>>>>>>>          Maybe `RELOAD MODULES` can express upgrading the
> >>>>>> priority of
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>> loaded
> >>>>>>>>>>>>>>>> module(s).
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> >>>>>>>>>>>>>>>>          Jark Wu and Nicholas Jiang proposed to use
> >>>> `CREATE/DROP
> >>>>>>>>>> MODULE`
> >>>>>>>>>>>>>>> instead
> >>>>>>>>>>>>>>>> of `LOAD/UNLOAD MODULE` because
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>       1) From a pure SQL user's perspective, maybe
> `CREATE
> >>>>>> MODULE +
> >>>>>>>>>> USE
> >>>>>>>>>>>>>>>> MODULE`
> >>>>>>>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
> >>>>>>>>>>>>>>>>>       2) This will be very similar to what the catalog
> >> used
> >>>> now.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>        Timo Walther would rather stick to the agreed
> design
> >>>>>> because
> >>>>>>>>>>>>>>>> loading/unloading modules is a concept known from kernels
> >>> etc.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> 3. Simplify the module design by mapping modules purely by
> >>>> name
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> LOAD MODULE geo_utils
> >>>>>>>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> >>>>>>>>>>>>>>> 'type='/'module='
> >>>>>>>>>>>>>>>> but allow only 1 module to be loaded parameterized
> >>>>>>>>>>>>>>>> UNLOAD hive
> >>>>>>>>>>>>>>>> USE MODULES hive, core
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Please find more details in the reference link. Looking
> >>>> forward
> >>>>>> to
> >>>>>>>>>>>> your
> >>>>>>>>>>>>>>>> feedback.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> >>>>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [2]
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>> Jane
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> Best regards!
> >>>>>>>>>>>>>> Rui Li
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLINK-21045: Support 'load module' and 'unload module' SQL syntax

Jane Chan
Reply @Timo

> Remove the `used` column for SHOW MODULES. It will always show true.
>
Good catch. It's a copy-paste typo, and I forgot to remove that column.

How about creating a POJO (static inner class of ModuleManager) called
> `ModuleEntry` or similar.
>
+1 for better encapsulation.

Reply @Jark

> A minor comment on `useModules(List<String> names)`, would be better to
> use varargs here to a more fluent API: `useModules("a", "b", "c")`.
>
 +1, and that's better.

Do we also need to add these new methods (useModules, listFullModules)
> to TableEnvironment?
>
Yes, indeed.

Thank you all for polishing this proposal to make it more thorough.

Best,
Jane

On Wed, Feb 3, 2021 at 6:41 PM Jark Wu <[hidden email]> wrote:

> A minor comment on `useModules(List<String> names)`,
> would be better to use varargs here to a more fluent API: `useModules("a",
> "b", "c")`.
>
> Besides, do we also need to add these new methods (useModules,
> listFullModules) to
> TableEnvironment?
>
> Best,
> Jark
>
> On Wed, 3 Feb 2021 at 18:36, Timo Walther <[hidden email]> wrote:
>
> > Thanks for the nice summary Jane. The summary looks great. Some minor
> > feedback:
> >
> > - Remove the `used` column for SHOW MODULES. It will always show true.
> >
> > - `List<Pair<String, Boolean>> listFullModules()` is a very long
> > signature. And `Pair` should be avoided in code because it is not very
> > descriptive. How about creating a POJO (static inner class of
> > ModuleManager) called `ModuleEntry` or similar.
> >
> > Otherwise +1 for the proposal.
> >
> > Regards,
> > Timo
> >
> > On 03.02.21 11:24, Jane Chan wrote:
> > > Hi everyone,
> > >
> > > I did a summary on the Jira issue page [1] since the discussion has
> > > achieved a consensus. If there is anything missed or not corrected,
> > please
> > > let me know.
> > >
> > > [1] https://issues.apache.org/jira/browse/FLINK-21045#
> > >
> > > Best,
> > > Jane
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Feb 3, 2021 at 1:33 PM Jark Wu <[hidden email]> wrote:
> > >
> > >> Hi Jane,
> > >>
> > >> Yes. I think we should fail fast.
> > >>
> > >> Best,
> > >> Jark
> > >>
> > >> On Wed, 3 Feb 2021 at 12:06, Jane Chan <[hidden email]> wrote:
> > >>
> > >>> Hi everyone,
> > >>>
> > >>> Thanks for the discussion to make this improvement plan clearer.
> > >>>
> > >>> Hi, @Jark, @Rui, and @Timo, I'm collecting the final discussion
> > summaries
> > >>> now and want to confirm one thing that for the statement `USE
> MODULES x
> > >> [,
> > >>> y, z, ...]`, if the module name list contains an unexsited module,
> > shall
> > >> we
> > >>> #1 fail the execution for all of them or #2 enabled the rest modules
> > and
> > >>> return a warning to users? My personal preference goes to #1 for
> > >>> simplicity. What do you think?
> > >>>
> > >>> Best,
> > >>> Jane
> > >>>
> > >>> On Tue, Feb 2, 2021 at 3:53 PM Timo Walther <[hidden email]>
> > wrote:
> > >>>
> > >>>> +1
> > >>>>
> > >>>> @Jane Can you summarize our discussion in the JIRA issue?
> > >>>>
> > >>>> Thanks,
> > >>>> Timo
> > >>>>
> > >>>>
> > >>>> On 02.02.21 03:50, Jark Wu wrote:
> > >>>>> Hi Timo,
> > >>>>>
> > >>>>>> Another question is whether a LOAD operation also adds the module
> to
> > >>> the
> > >>>>> enabled list by default?
> > >>>>>
> > >>>>> I would like to add the module to the enabled list by default, the
> > >> main
> > >>>>> reasons are:
> > >>>>> 1) Reordering is an advanced requirement, adding modules needs
> > >>> additional
> > >>>>> USE statements with "core" module
> > >>>>>    sounds too burdensome. Most users should be satisfied with only
> > >> LOAD
> > >>>>> statements.
> > >>>>> 2) We should keep compatible for TableEnvironment#loadModule().
> > >>>>> 3) We are using the LOAD statement instead of CREATE, so I think
> it's
> > >>>> fine
> > >>>>> that it does some implicit things.
> > >>>>>
> > >>>>> Best,
> > >>>>> Jark
> > >>>>>
> > >>>>> On Tue, 2 Feb 2021 at 00:48, Timo Walther <[hidden email]>
> > >> wrote:
> > >>>>>
> > >>>>>> Not the module itself but the ModuleManager should handle this
> case,
> > >>>> yes.
> > >>>>>>
> > >>>>>> Regards,
> > >>>>>> Timo
> > >>>>>>
> > >>>>>>
> > >>>>>> On 01.02.21 17:35, Jane Chan wrote:
> > >>>>>>> +1 to Jark's proposal
> > >>>>>>>
> > >>>>>>>     To make it clearer,  will `module#getFunctionDefinition()`
> > >> return
> > >>>> empty
> > >>>>>>> suppose the module is loaded but not enabled?
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Jane
> > >>>>>>>
> > >>>>>>> On Mon, Feb 1, 2021 at 10:02 PM Timo Walther <[hidden email]
> >
> > >>>> wrote:
> > >>>>>>>
> > >>>>>>>> +1 to Jark's proposal
> > >>>>>>>>
> > >>>>>>>> I like the difference between just loading and actually enabling
> > >>> these
> > >>>>>>>> modules.
> > >>>>>>>>
> > >>>>>>>> @Rui: I would use the same behavior as catalogs here. You cannot
> > >>>> `USE` a
> > >>>>>>>> catalog without creating it before.
> > >>>>>>>>
> > >>>>>>>> Another question is whether a LOAD operation also adds the
> module
> > >> to
> > >>>> the
> > >>>>>>>> enabled list by default?
> > >>>>>>>>
> > >>>>>>>> Regards,
> > >>>>>>>> Timo
> > >>>>>>>>
> > >>>>>>>> On 01.02.21 13:52, Rui Li wrote:
> > >>>>>>>>> If `USE MODULES` implies unloading modules that are not listed,
> > >>> does
> > >>>> it
> > >>>>>>>>> also imply loading modules that are not previously loaded,
> > >>> especially
> > >>>>>>>> since
> > >>>>>>>>> we're mapping modules by name now?
> > >>>>>>>>>
> > >>>>>>>>> On Mon, Feb 1, 2021 at 8:20 PM Jark Wu <[hidden email]>
> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> I agree with Timo that the USE implies the specified modules
> are
> > >>> in
> > >>>>>> use
> > >>>>>>>> in
> > >>>>>>>>>> the specified order and others are not used.
> > >>>>>>>>>> This would be easier to know what's the result list and order
> > >>> after
> > >>>>>> the
> > >>>>>>>> USE
> > >>>>>>>>>> statement.
> > >>>>>>>>>> That means: if current modules in order are x, y, z. And `USE
> > >>>> MODULES
> > >>>>>>>> z, y`
> > >>>>>>>>>> means current modules in order are z, y.
> > >>>>>>>>>>
> > >>>>>>>>>> But I would like to not unload the unmentioned modules in the
> > >> USE
> > >>>>>>>>>> statement. Because it seems strange that USE
> > >>>>>>>>>> will implicitly remove modules. In the above example, the user
> > >> may
> > >>>>>> type
> > >>>>>>>> the
> > >>>>>>>>>> wrong modules list using USE by mistake
> > >>>>>>>>>>      and would like to declare the list again, the user has to
> > >>> create
> > >>>>>> the
> > >>>>>>>>>> module again with some properties he may don't know.
> Therefore,
> > >> I
> > >>>>>>>> propose
> > >>>>>>>>>> the USE statement just specifies the current module lists and
> > >>>> doesn't
> > >>>>>>>>>> unload modules.
> > >>>>>>>>>> Besides that, we may need a new syntax to list all the modules
> > >>>>>> including
> > >>>>>>>>>> not used but loaded.
> > >>>>>>>>>> We can introduce SHOW FULL MODULES for this purpose with an
> > >>>> additional
> > >>>>>>>>>> `used` column.
> > >>>>>>>>>>
> > >>>>>>>>>> For example:
> > >>>>>>>>>>
> > >>>>>>>>>> Flink SQL> list modules:
> > >>>>>>>>>> -----------
> > >>>>>>>>>> | modules |
> > >>>>>>>>>> -----------
> > >>>>>>>>>> | x       |
> > >>>>>>>>>> | y       |
> > >>>>>>>>>> | z       |
> > >>>>>>>>>> -----------
> > >>>>>>>>>> Flink SQL> USE MODULES z, y;
> > >>>>>>>>>> Flink SQL> show modules:
> > >>>>>>>>>> -----------
> > >>>>>>>>>> | modules |
> > >>>>>>>>>> -----------
> > >>>>>>>>>> | z       |
> > >>>>>>>>>> | y       |
> > >>>>>>>>>> -----------
> > >>>>>>>>>> Flink SQL> show FULL modules;
> > >>>>>>>>>> -------------------
> > >>>>>>>>>> | modules |  used |
> > >>>>>>>>>> -------------------
> > >>>>>>>>>> | z       | true  |
> > >>>>>>>>>> | y       | true  |
> > >>>>>>>>>> | x       | false |
> > >>>>>>>>>> -------------------
> > >>>>>>>>>> Flink SQL> USE MODULES z, y, x;
> > >>>>>>>>>> Flink SQL> show modules;
> > >>>>>>>>>> -----------
> > >>>>>>>>>> | modules |
> > >>>>>>>>>> -----------
> > >>>>>>>>>> | z       |
> > >>>>>>>>>> | y       |
> > >>>>>>>>>> | x       |
> > >>>>>>>>>> -----------
> > >>>>>>>>>>
> > >>>>>>>>>> What do you think?
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Jark
> > >>>>>>>>>>
> > >>>>>>>>>> On Mon, 1 Feb 2021 at 19:02, Jane Chan <[hidden email]
> >
> > >>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Timo, thanks for the discussion.
> > >>>>>>>>>>>
> > >>>>>>>>>>> It seems to reach an agreement regarding #3 that <1> Module
> > >> name
> > >>>>>> should
> > >>>>>>>>>>> better be a simple identifier rather than a string literal.
> <2>
> > >>>>>>>> Property
> > >>>>>>>>>>> `type` is redundant and should be removed, and mapping will
> > >> rely
> > >>> on
> > >>>>>> the
> > >>>>>>>>>>> module name because loading a module multiple times just
> using
> > >> a
> > >>>>>>>>>> different
> > >>>>>>>>>>> module name doesn't make much sense. <3> We should migrate to
> > >> the
> > >>>>>> newer
> > >>>>>>>>>> API
> > >>>>>>>>>>> rather than the deprecated `TableFactory` class.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regarding #1, I think the point lies in whether changing the
> > >>>>>> resolution
> > >>>>>>>>>>> order implies an `unload` operation explicitly (i.e., users
> > >> could
> > >>>>>> sense
> > >>>>>>>>>>> it). What do others think?
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>> Jane
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Mon, Feb 1, 2021 at 6:41 PM Timo Walther <
> > >> [hidden email]>
> > >>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> IMHO I would rather unload the not mentioned modules. The
> > >>>> statement
> > >>>>>>>>>>>> expresses `USE` that implicilty implies that the other
> modules
> > >>> are
> > >>>>>>>> "not
> > >>>>>>>>>>>> used". What do others think?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Regards,
> > >>>>>>>>>>>> Timo
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On 01.02.21 11:28, Jane Chan wrote:
> > >>>>>>>>>>>>> Hi Jark and Rui,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks for the discussions.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Regarding #1, I'm fine with `USE MODULES` syntax, and
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> It can be interpreted as "setting the current order of
> > >>> modules",
> > >>>>>>>>>> which
> > >>>>>>>>>>>> is
> > >>>>>>>>>>>>>> similar to "setting the current catalog" for `USE
> CATALOG`.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>> I would like to confirm that the unmentioned modules remain
> > >> in
> > >>>> the
> > >>>>>>>>>> same
> > >>>>>>>>>>>>> relative order? E.g., if there are three loaded modules
> `X`,
> > >>> `Y`,
> > >>>>>>>>>> `Z`,
> > >>>>>>>>>>>> then
> > >>>>>>>>>>>>> `USE MODULES Y, Z` means shifting the order to `Y`, `Z`,
> `X`.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Regarding #3, I'm fine with mapping modules purely by name,
> > >>> and I
> > >>>>>>>>>> think
> > >>>>>>>>>>>>> Jark raised a good point on making the module name a simple
> > >>>>>>>>>> identifier
> > >>>>>>>>>>>>> instead of a string literal. For backward compatibility,
> > >> since
> > >>> we
> > >>>>>>>>>>> haven't
> > >>>>>>>>>>>>> supported this syntax yet, the affected users are those who
> > >>>> defined
> > >>>>>>>>>>>> modules
> > >>>>>>>>>>>>> in the YAML configuration file. Maybe we can eliminate the
> > >>> 'type'
> > >>>>>>>>>> from
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>> 'requiredContext' to make it optional. Thus the proposed
> > >>> mapping
> > >>>>>>>>>>>> mechanism
> > >>>>>>>>>>>>> could use the module name to lookup the suitable factory,
> > >> and
> > >>> in
> > >>>>>> the
> > >>>>>>>>>>>>> meanwhile updating documentation to encourage users to
> > >> simplify
> > >>>>>> their
> > >>>>>>>>>>>> YAML
> > >>>>>>>>>>>>> configuration. And in the long run, we can deprecate the
> > >>> 'type'.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>> Jane
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Mon, Feb 1, 2021 at 4:19 PM Rui Li <
> [hidden email]
> > >>>
> > >>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks Jane for starting the discussion.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Regarding #1, I also prefer `USE MODULES` syntax. It can
> be
> > >>>>>>>>>>> interpreted
> > >>>>>>>>>>>> as
> > >>>>>>>>>>>>>> "setting the current order of modules", which is similar
> to
> > >>>>>> "setting
> > >>>>>>>>>>> the
> > >>>>>>>>>>>>>> current catalog" for `USE CATALOG`.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Regarding #3, I'm fine to map modules purely by name
> > >> because I
> > >>>>>> think
> > >>>>>>>>>>> it
> > >>>>>>>>>>>>>> satisfies all the use cases we have at hand. But I guess
> we
> > >>> need
> > >>>>>> to
> > >>>>>>>>>>> make
> > >>>>>>>>>>>>>> sure we're backward compatible, i.e. users don't need to
> > >>> change
> > >>>>>>>>>> their
> > >>>>>>>>>>>> yaml
> > >>>>>>>>>>>>>> files to configure the modules.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Mon, Feb 1, 2021 at 3:10 PM Jark Wu <[hidden email]>
> > >>>> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thanks Jane for the summary and starting the discussion
> in
> > >>> the
> > >>>>>>>>>>> mailing
> > >>>>>>>>>>>>>>> list.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Here are my thoughts:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> 1) syntax to reorder modules
> > >>>>>>>>>>>>>>> I agree with Rui Li it would be quite useful if we can
> have
> > >>>> some
> > >>>>>>>>>>> syntax
> > >>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>> reorder modules.
> > >>>>>>>>>>>>>>> I slightly prefer `USE MODULES x, y, z` than `RELOAD
> > >> MODULES
> > >>> x,
> > >>>>>> y,
> > >>>>>>>>>>> z`,
> > >>>>>>>>>>>>>>> because USE has a more sense of effective and specifying
> > >>>>>> ordering,
> > >>>>>>>>>>> than
> > >>>>>>>>>>>>>>> RELOAD.
> > >>>>>>>>>>>>>>>      From my feeling, RELOAD just means we unregister and
> > >>>> register
> > >>>>>>>>>> x,y,z
> > >>>>>>>>>>>>>> modules
> > >>>>>>>>>>>>>>> again,
> > >>>>>>>>>>>>>>> it sounds like other registered modules are still in use
> > >> and
> > >>> in
> > >>>>>> the
> > >>>>>>>>>>>>>> order.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> 3) mapping modules purely by name
> > >>>>>>>>>>>>>>> This can definitely improve the usability of loading
> > >> modules,
> > >>>>>>>>>> because
> > >>>>>>>>>>>>>>> the 'type=' property
> > >>>>>>>>>>>>>>> looks really redundant. We can think of this as a syntax
> > >>> sugar
> > >>>>>> that
> > >>>>>>>>>>> the
> > >>>>>>>>>>>>>>> default type value is the module name.
> > >>>>>>>>>>>>>>> And we can support to specify 'type=' property in the
> > >> future
> > >>> to
> > >>>>>>>>>> allow
> > >>>>>>>>>>>>>>> multiple modules for one module type.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Besides, I would like to mention one more change, that
> the
> > >>>> module
> > >>>>>>>>>>> name
> > >>>>>>>>>>>>>>> proposed in FLIP-68 is a string literal.
> > >>>>>>>>>>>>>>> But I think we are all on the same page to change it
> into a
> > >>>>>> simple
> > >>>>>>>>>>>>>>> (non-compound) identifier.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> LOAD/UNLOAD MODULE 'core'
> > >>>>>>>>>>>>>>> ==>
> > >>>>>>>>>>>>>>> LOAD/UNLOAD MODULE core
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>> Jark
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Sat, 30 Jan 2021 at 04:00, Jane Chan <
> > >>> [hidden email]
> > >>>>>
> > >>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> I would like to start a discussion on FLINK-21045 [1]
> > >> about
> > >>>>>>>>>>> supporting
> > >>>>>>>>>>>>>>>> `LOAD MODULE` and `UNLOAD MODULE` SQL syntax. It's first
> > >>>>>> proposed
> > >>>>>>>>>> by
> > >>>>>>>>>>>>>>>> FLIP-68 [2] as following.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> -- load a module with the given name and append it to
> the
> > >>> end
> > >>>> of
> > >>>>>>>>>> the
> > >>>>>>>>>>>>>>> module
> > >>>>>>>>>>>>>>>> list
> > >>>>>>>>>>>>>>>> LOAD MODULE 'name' [WITH ('type'='xxx', 'prop'='myProp',
> > >>> ...)]
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> --unload a module by name from the module list and other
> > >>>> modules
> > >>>>>>>>>>>> remain
> > >>>>>>>>>>>>>>> in
> > >>>>>>>>>>>>>>>> the same relative positions
> > >>>>>>>>>>>>>>>> UNLOAD MODULE 'name'
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> After a round of discussion on the Jira ticket, it seems
> > >>> some
> > >>>>>>>>>>>>>> unanswered
> > >>>>>>>>>>>>>>>> questions need more opinions and suggestions.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> 1. The way to redefine resolution order easily
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>          Rui Li suggested introducing `USE MODULES` and
> > >>> adding
> > >>>>>>>> similar
> > >>>>>>>>>>>>>>>> functionality to the API because
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>       1) It's very tedious to unload old modules just
> to
> > >>>> reorder
> > >>>>>>>>>> them.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>       2) Users may not even know how to "re-load" an old
> > >>> module
> > >>>>>> if it
> > >>>>>>>>>>> was
> > >>>>>>>>>>>>>> not
> > >>>>>>>>>>>>>>>>> initially loaded by the user, e.g. don't know which
> type
> > >> to
> > >>>>>> use.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>          Jane Chan wondered that module is not like the
> > >>> catalog
> > >>>>>> which
> > >>>>>>>>>>> has
> > >>>>>>>>>>>> a
> > >>>>>>>>>>>>>>>> concept of namespace could specify, and `USE` sounds
> like
> > >> a
> > >>>>>>>>>>>>>>>> mutual-exclusive concept.
> > >>>>>>>>>>>>>>>>          Maybe `RELOAD MODULES` can express upgrading
> the
> > >>>>>> priority of
> > >>>>>>>>>>> the
> > >>>>>>>>>>>>>>> loaded
> > >>>>>>>>>>>>>>>> module(s).
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> 2. `LOAD/UNLOAD MODULE` v.s. `CREATE/DROP MODULE` syntax
> > >>>>>>>>>>>>>>>>          Jark Wu and Nicholas Jiang proposed to use
> > >>>> `CREATE/DROP
> > >>>>>>>>>> MODULE`
> > >>>>>>>>>>>>>>> instead
> > >>>>>>>>>>>>>>>> of `LOAD/UNLOAD MODULE` because
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>       1) From a pure SQL user's perspective, maybe
> > `CREATE
> > >>>>>> MODULE +
> > >>>>>>>>>> USE
> > >>>>>>>>>>>>>>>> MODULE`
> > >>>>>>>>>>>>>>>>> is easier to use rather than `LOAD/UNLOAD`.
> > >>>>>>>>>>>>>>>>>       2) This will be very similar to what the catalog
> > >> used
> > >>>> now.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>        Timo Walther would rather stick to the agreed
> > design
> > >>>>>> because
> > >>>>>>>>>>>>>>>> loading/unloading modules is a concept known from
> kernels
> > >>> etc.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> 3. Simplify the module design by mapping modules purely
> by
> > >>>> name
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> LOAD MODULE geo_utils
> > >>>>>>>>>>>>>>>> LOAD MODULE hive WITH ('version'='2.1')  -- no dedicated
> > >>>>>>>>>>>>>>> 'type='/'module='
> > >>>>>>>>>>>>>>>> but allow only 1 module to be loaded parameterized
> > >>>>>>>>>>>>>>>> UNLOAD hive
> > >>>>>>>>>>>>>>>> USE MODULES hive, core
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Please find more details in the reference link. Looking
> > >>>> forward
> > >>>>>> to
> > >>>>>>>>>>>> your
> > >>>>>>>>>>>>>>>> feedback.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/FLINK-21045#
> > >>>>>>>>>>>>>>>> <
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> [2]
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-68%3A+Extend+Core+Table+System+with+Pluggable+Modules
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>> Jane
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>> Best regards!
> > >>>>>>>>>>>>>> Rui Li
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>
> > >>
> > >
> >
> >
>
12