[DISCUSS] Flink Kerberos Improvement

[DISCUSS] Flink Kerberos Improvement

Rong Rong
Hi All,

We have been experimenting with integrating Kerberos with Flink in our
corporate environment and have found some limitations in the current
Flink-Kerberos security mechanism when running on Apache YARN.

According to the Hadoop Kerberos security guide [1], Flink apparently
supports only a subset of the suggested security mechanisms for
long-running services. Furthermore, the current model does not work well
with a superuser impersonating actual users [2] for deployment purposes,
which is a widely adopted way to launch applications in corporate
environments.

We would like to propose an improvement [3] that introduces the other common
methods [1] for securing long-running applications on YARN and enables
impersonation mode. Any comments and suggestions are highly appreciated.

Many thanks,
Rong

[1]
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html#Securing_Long-lived_YARN_Services
[2]
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
[3]
https://docs.google.com/document/d/1rBLCpyQKg6Ld2P0DEgv4VIOMTwv4sitd7h7P5r202IE/edit?usp=sharing

Re: [DISCUSS] Flink Kerberos Improvement

Shuyi Chen
Hi Rong, thanks a lot for the proposal. Currently, Flink assumes the keytab
is located in a remote DFS. Pre-installing keytabs statically on the local
filesystem of each YARN node is a common approach, so I think we should
support this mode in Flink natively. As an optimization to reduce the KDC
access frequency, we should also support method 3 (the delegation token, or
DT, approach) as discussed in [1]. One question: why do we need to implement
impersonation in Flink? I assume the superuser can do the impersonation for
'joe', and 'joe' can then invoke the Flink client to deploy the job. Thanks
a lot.

Shuyi

[1]
https://docs.google.com/document/d/10V7LiNlUJKeKZ58mkR7oVv1t6BrC6TZi3FGf2Dm6-i8/edit
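
To make the KDC-frequency point in this message concrete, here is a minimal Python sketch (not Flink or Hadoop code) that counts KDC logins under two approaches. It assumes, as a simplification, that every process holding a keytab logs in to the KDC once per ticket lifetime, and that in the DT approach only the YARN ApplicationMaster (AM) holds the keytab. The function names and the 24-hour ticket lifetime are hypothetical.

```python
import math


def kdc_logins_keytab_everywhere(num_containers: int, run_hours: float,
                                 ticket_lifetime_hours: float = 24.0) -> int:
    """Every container plus the AM holds the keytab and logs in on its own."""
    logins_per_process = math.ceil(run_hours / ticket_lifetime_hours)
    return (num_containers + 1) * logins_per_process


def kdc_logins_delegation_tokens(num_containers: int, run_hours: float,
                                 ticket_lifetime_hours: float = 24.0) -> int:
    """Only the AM logs in; containers receive delegation tokens from it."""
    del num_containers  # containers never contact the KDC in this model
    return math.ceil(run_hours / ticket_lifetime_hours)


if __name__ == "__main__":
    # 100 containers running for 7 days with a 24-hour ticket lifetime.
    print(kdc_logins_keytab_everywhere(100, 7 * 24))   # 707
    print(kdc_logins_delegation_tokens(100, 7 * 24))   # 7
```

In this simplified model the KDC load grows with the number of containers when each one holds a keytab, but stays constant in the DT approach.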

On Mon, Dec 17, 2018 at 5:49 PM Rong Rong <[hidden email]> wrote:

> Hi All,
>
> We have been experimenting with integrating Kerberos with Flink in our
> corporate environment and have found some limitations in the current
> Flink-Kerberos security mechanism when running on Apache YARN.
>
> According to the Hadoop Kerberos security guide [1], Flink apparently
> supports only a subset of the suggested security mechanisms for
> long-running services. Furthermore, the current model does not work well
> with a superuser impersonating actual users [2] for deployment purposes,
> which is a widely adopted way to launch applications in corporate
> environments.
>
> We would like to propose an improvement [3] that introduces the other common
> methods [1] for securing long-running applications on YARN and enables
> impersonation mode. Any comments and suggestions are highly appreciated.
>
> Many thanks,
> Rong
>
> [1]
>
> https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html#Securing_Long-lived_YARN_Services
> [2]
>
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
> [3]
>
> https://docs.google.com/document/d/1rBLCpyQKg6Ld2P0DEgv4VIOMTwv4sitd7h7P5r202IE/edit?usp=sharing
>


--
"So you have to trust that the dots will somehow connect in your future."

Re: [DISCUSS] Flink Kerberos Improvement

Rong Rong
Hi Shuyi,

Yes, impersonation is a very valid question! It can actually be split into
two questions, as I stated in the doc.
1. In the doc I stated that impersonation should be implemented in the
user-side code, which should only invoke the cluster client as the actual
user 'joe'.
2. However, since the cluster client currently assumes no impersonation at
all, much of the code assumes that a fully authorized client can be
instantiated with the same authority that the actual Flink cluster has.
When impersonation is enabled, this might not be the case. For example, if
impersonation is in place, the cluster client running on joe's behalf most
likely will not, and should not, have access to the keytab file of 'joe'.
Instead, a delegation token is used. The second part of the doc tries to
address this issue.

--
Rong
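
The distinction in point 2 of this message can be sketched as a small Python model (not the Flink client API; all class and function names are hypothetical). It assumes the superuser process obtains a delegation token on joe's behalf and hands only that token to the cluster client.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClientCredentials:
    """What a cluster client process can authenticate with."""
    effective_user: str
    keytab_path: Optional[str]        # None when no keytab is available
    delegation_token: Optional[str]   # None when no token was issued


def direct_client(user: str, keytab_path: str) -> ClientCredentials:
    """No impersonation: the client runs as the user and reads the user's keytab."""
    return ClientCredentials(user, keytab_path, None)


def impersonated_client(superuser: str, target_user: str) -> ClientCredentials:
    """Impersonation: the superuser obtains a token for target_user.

    The client acts as target_user but never sees target_user's keytab.
    """
    token = f"token-for-{target_user}-issued-via-{superuser}"  # placeholder value
    return ClientCredentials(target_user, None, token)


def can_ship_keytab_to_cluster(creds: ClientCredentials) -> bool:
    """Code paths that upload a keytab only work when the client holds one."""
    return creds.keytab_path is not None


if __name__ == "__main__":
    direct = direct_client("joe", "/path/to/joe.keytab")
    proxied = impersonated_client("deployer", "joe")
    print(can_ship_keytab_to_cluster(direct))    # True
    print(can_ship_keytab_to_cluster(proxied))   # False
    print(proxied.effective_user)                # joe
```

Any client code path that takes the keytab for granted fails the second check, which is the situation the second part of the doc is meant to handle.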

On Mon, Dec 17, 2018 at 11:41 PM Shuyi Chen <[hidden email]> wrote:

> Hi Rong, thanks a lot for the proposal. Currently, Flink assumes the keytab
> is located in a remote DFS. Pre-installing keytabs statically on the local
> filesystem of each YARN node is a common approach, so I think we should
> support this mode in Flink natively. As an optimization to reduce the KDC
> access frequency, we should also support method 3 (the delegation token, or
> DT, approach) as discussed in [1]. One question: why do we need to implement
> impersonation in Flink? I assume the superuser can do the impersonation for
> 'joe', and 'joe' can then invoke the Flink client to deploy the job. Thanks
> a lot.
>
> Shuyi
>
> [1]
>
> https://docs.google.com/document/d/10V7LiNlUJKeKZ58mkR7oVv1t6BrC6TZi3FGf2Dm6-i8/edit

Re: [DISCUSS] Flink Kerberos Improvement

Stephan Ewen
Hi all!

A quick question: Is this a special case of the security improvements
proposed in this thread [1], or a separate proposal altogether?

Stephan

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html

On Tue, Dec 18, 2018 at 8:06 PM Rong Rong <[hidden email]> wrote:

> Hi Shuyi,
>
> Yes, impersonation is a very valid question! It can actually be split into
> two questions, as I stated in the doc.
> 1. In the doc I stated that impersonation should be implemented in the
> user-side code, which should only invoke the cluster client as the actual
> user 'joe'.
> 2. However, since the cluster client currently assumes no impersonation at
> all, much of the code assumes that a fully authorized client can be
> instantiated with the same authority that the actual Flink cluster has.
> When impersonation is enabled, this might not be the case. For example, if
> impersonation is in place, the cluster client running on joe's behalf most
> likely will not, and should not, have access to the keytab file of 'joe'.
> Instead, a delegation token is used. The second part of the doc tries to
> address this issue.
>
> --
> Rong

Re: [DISCUSS] Flink Kerberos Improvement

Rong Rong
Hi Stephan,

This proposal is an extension of @shuyi's initial improvement, specifically
to tackle Kerberos-related issues.
However, for this extension to work, some of the originally proposed
components are required (such as the service provider pattern for
security factories).

Thanks,
Rong

On Thu, Feb 14, 2019 at 1:35 AM Stephan Ewen <[hidden email]> wrote:

> Hi all!
>
> A quick question: Is this a special case of the security improvements
> proposed in this thread [1], or a separate proposal altogether?
>
> Stephan
>
> [1]
>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html

Re: [DISCUSS] Flink Kerberos Improvement

杨弢(杨弢)
In reply to this post by Rong Rong

Hi, all!
We have run into similar security requirements and investigated the available security strategies. The third strategy mentioned in the YARN security doc (AM keytab distributed via YARN; AM regenerates delegation tokens for containers) is already used by Spark 1.5+, and we agree that Flink needs to support it. Moreover, we would like the security improvements in Flink to be applicable to other resource management systems such as k8s. (BTW, we have done some work to let Flink applications run natively on a k8s cluster.) We are going to do some work on this and hope it helps in finding a more generic solution. Thanks!
Tao Yang
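
As an illustration of the third strategy mentioned in this message, the following Python sketch (not taken from YARN, Spark, or Flink) computes when an AM holding a keytab would regenerate delegation tokens for its containers. The 0.75 renewal ratio, the token lifetime, and the function name are assumptions chosen for the example.

```python
from typing import List


def regeneration_times(run_hours: float, token_lifetime_hours: float,
                       renewal_ratio: float = 0.75) -> List[float]:
    """Return the hours (since start) at which the AM issues fresh tokens.

    The AM regenerates tokens after renewal_ratio of the lifetime has
    elapsed, so containers receive new tokens before the old ones expire.
    """
    if not 0 < renewal_ratio < 1:
        raise ValueError("renewal_ratio must be between 0 and 1")
    if token_lifetime_hours <= 0:
        raise ValueError("token_lifetime_hours must be positive")
    interval = token_lifetime_hours * renewal_ratio
    times = []
    next_time = interval
    while next_time < run_hours:
        times.append(next_time)
        next_time += interval
    return times


if __name__ == "__main__":
    # A 3-day job with 24-hour tokens: new tokens every 18 hours.
    print(regeneration_times(72, 24))  # [18.0, 36.0, 54.0]
```

Only the AM needs the keytab in this scheme; the containers just pick up each new batch of tokens.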



Re: [DISCUSS] Flink Kerberos Improvement

Rong Rong
Hi Tao,

Thanks for the comments and suggestions. Yes, I agree that the security
improvements should apply to other cluster management systems if designed
properly.

I am not very familiar with the K8s security setup, but most of the changes
we propose should be generic enough to apply to all resource management
systems.
Please take a look at one implementation [1] of another design
initiative [2] we had. It would be great if you could provide any
additional comments or suggestions on that design doc as well.

Many Thanks,
Rong

--

[1] https://issues.apache.org/jira/browse/FLINK-11589
[2]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-security-improvements-td21068.html

On Thu, Mar 21, 2019 at 7:00 AM 杨弢(杨弢) <[hidden email]> wrote:

>
> Hi, all!
> We have run into similar security requirements and investigated the
> available security strategies. The third strategy mentioned in the YARN
> security doc (AM keytab distributed via YARN; AM regenerates delegation
> tokens for containers) is already used by Spark 1.5+, and we agree that
> Flink needs to support it. Moreover, we would like the security
> improvements in Flink to be applicable to other resource management
> systems such as k8s. (BTW, we have done some work to let Flink
> applications run natively on a k8s cluster.) We are going to do some work
> on this and hope it helps in finding a more generic solution. Thanks!
> Tao Yang