Using secure cluster resources without authentication

Using secure cluster resources without authentication

stefanobaghino
Hello everybody,

last week I ran some tests on a secure cluster and noticed that an
unauthenticated user can submit a Flink job, which fails only once it tries
to access secured resources (e.g. HDFS). This does not, however, prevent the
user from consuming resources of the secure cluster without authentication
(I tried it with the WordCount example).

I'd say this is a bug; is there a reason for this behavior? If you agree,
I have pinpointed the code responsible and the fix seems trivial, so I can
open an issue and a PR today. Thanks!

--
BR,
Stefano Baghino

Software Engineer @ Radicalbit

Re: Using secure cluster resources without authentication

Robert Metzger
Hi Stefano,

what exactly do you mean by a secure cluster?
A Flink on YARN session in a secured YARN cluster?
A standalone Flink cluster with access to a secured HDFS?

Your observation is right. We do not check whether a job submitted by a user
is running in the same security context as the Flink cluster.


On Thu, May 5, 2016 at 11:57 AM, Stefano Baghino <
[hidden email]> wrote:

> Hello everybody,
>
> last week I ran some tests on a secure cluster and noticed that an
> unauthenticated user can submit a Flink job, which fails only once it tries
> to access secured resources (e.g. HDFS). This does not, however, prevent the
> user from consuming resources of the secure cluster without authentication
> (I tried it with the WordCount example).
>
> I'd say this is a bug; is there a reason for this behavior? If you agree,
> I have pinpointed the code responsible and the fix seems trivial, so I can
> open an issue and a PR today. Thanks!
>
> --
> BR,
> Stefano Baghino
>
> Software Engineer @ Radicalbit
>

Re: Using secure cluster resources without authentication

stefanobaghino
Apologies for being too vague: by "secure cluster" I mean a Flink cluster
that has been launched with Kerberos credentials (either on YARN or with the
standalone scheduler) and thus has access to resources on the cluster that
require authentication (like HDFS).

Leaving aside running jobs on behalf of an authenticated user (which is a
separate problem), the facilities to check that the submitter is
authenticated are already in place (CliFrontend::parseParameters, in the
branch of the switch-case statement that handles the "run" command), so
requiring a submission to come from an authenticated user should come almost
for free.
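
A minimal sketch of the kind of client-side check meant here. It does not
reproduce Flink's actual CliFrontend code: the class SubmissionGuard, the
method checkAction, and the securityEnabled / hasCredentials parameters are
hypothetical stand-ins (in a real client the credential test would presumably
be backed by the Kerberos login state).

```java
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the "run" branch of a CLI front end; not Flink's actual code. */
class SubmissionGuard {

    /**
     * Returns 0 if the action may proceed and 1 if it is rejected.
     *
     * @param args            command line; args[0] is the action (e.g. "run")
     * @param securityEnabled assumption: true when the cluster was launched with Kerberos
     * @param hasCredentials  assumption: reports whether the submitter is authenticated
     */
    static int checkAction(String[] args, boolean securityEnabled, BooleanSupplier hasCredentials) {
        if (args.length < 1) {
            System.err.println("Please specify an action.");
            return 1;
        }
        switch (args[0]) {
            case "run":
                // Refuse to submit before any cluster resource is requested.
                if (securityEnabled && !hasCredentials.getAsBoolean()) {
                    System.err.println("Security is enabled, but no valid credentials were found.");
                    return 1;
                }
                return 0;
            default:
                System.err.println("Unknown action: " + args[0]);
                return 1;
        }
    }
}
```

With security enabled and no credentials, checkAction returns 1 for "run"
without contacting the cluster; with security disabled the same call returns
0, so non-secure setups are unaffected.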

On Thu, May 5, 2016 at 1:18 PM, Robert Metzger <[hidden email]> wrote:

> Hi Stefano,
>
> what exactly do you mean by a secure cluster?
> A Flink on YARN session in a secured YARN cluster?
> A standalone Flink cluster with access to a secured HDFS?
>
> Your observation is right. We do not check whether a job submitted by a user
> is running in the same security context as the Flink cluster.
>

--
BR,
Stefano Baghino

Software Engineer @ Radicalbit

Re: Using secure cluster resources without authentication

Robert Metzger
I'm not sure that doing the check in the CliFrontend is really effective. A
"hacker" could just create a custom Flink build without that check and
still submit a job to the job manager.
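
To make the concern concrete, here is a toy model (all names are
hypothetical, nothing here is Flink API): the job manager accepts every
submission it receives, so a check that lives only in the official client is
skipped by any client that omits it.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not Flink code: a job manager that trusts every client. */
class TrustingJobManager {
    private final List<String> acceptedJobs = new ArrayList<>();

    /** Accepts every job; no check happens on the receiving side. */
    void submit(String jobName) {
        acceptedJobs.add(jobName);
    }

    List<String> acceptedJobs() {
        return acceptedJobs;
    }
}

/** Official client: performs the authentication check before submitting. */
class CheckedClient {
    static boolean submit(TrustingJobManager jm, String job, boolean authenticated) {
        if (!authenticated) {
            return false; // rejected locally, the job manager never sees the job
        }
        jm.submit(job);
        return true;
    }
}

/** Custom build with the check removed: talks to the job manager directly. */
class PatchedClient {
    static boolean submit(TrustingJobManager jm, String job, boolean authenticated) {
        jm.submit(job); // the 'authenticated' flag is simply ignored
        return true;
    }
}
```

An unauthenticated submission through CheckedClient is refused, while the
same submission through PatchedClient ends up in acceptedJobs().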



On Thu, May 5, 2016 at 2:51 PM, Stefano Baghino <
[hidden email]> wrote:

> Apologies for being too vague: by "secure cluster" I mean a Flink cluster
> that has been launched with Kerberos credentials (either on YARN or with the
> standalone scheduler) and thus has access to resources on the cluster that
> require authentication (like HDFS).
>
> Leaving aside running jobs on behalf of an authenticated user (which is a
> separate problem), the facilities to check that the submitter is
> authenticated are already in place (CliFrontend::parseParameters, in the
> branch of the switch-case statement that handles the "run" command), so
> requiring a submission to come from an authenticated user should come almost
> for free.
>

Re: Using secure cluster resources without authentication

Eron Wright
I believe that really protecting the cluster from unauthorized use requires
that the cluster endpoints (notably Akka) perform an authorization check.
The 'secure flink' design doc outlines various measures to achieve that.
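
As an illustration only of what an endpoint-side check could look like (this
is not taken from the design doc, and Flink's actual mechanism may differ),
the sketch below has the receiving side verify a message authentication code
computed with a shared secret before accepting a job. SecuredEndpoint, sign,
and accept are hypothetical names; only the javax.crypto and java.security
calls are real JDK API.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical endpoint that only accepts submissions carrying a valid HMAC tag. */
class SecuredEndpoint {
    private static final String ALGORITHM = "HmacSHA256";
    private final byte[] sharedSecret;

    SecuredEndpoint(byte[] sharedSecret) {
        this.sharedSecret = sharedSecret.clone();
    }

    /** Computes the tag that a client holding the secret attaches to its submission. */
    static byte[] sign(byte[] secret, String jobName) {
        try {
            Mac mac = Mac.getInstance(ALGORITHM);
            mac.init(new SecretKeySpec(secret, ALGORITHM));
            return mac.doFinal(jobName.getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Could not compute HMAC", e);
        }
    }

    /** Runs on the receiving side, so a modified client cannot skip it. */
    boolean accept(String jobName, byte[] tag) {
        byte[] expected = sign(sharedSecret, jobName);
        // MessageDigest.isEqual avoids revealing, through timing, where the tags differ.
        return MessageDigest.isEqual(expected, tag);
    }
}
```

Because the comparison happens in the endpoint, a client built without any
local check is still rejected unless it holds the secret.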

Stefano, I'll reach out to set up a sync-up meeting and to incorporate your
feedback.

Thanks!

> On May 17, 2016, at 8:36 AM, Robert Metzger <[hidden email]> wrote:
>
> I'm not sure that doing the check in the CliFrontend is really effective. A
> "hacker" could just create a custom Flink build without that check and
> still submit a job to the job manager.
>
>
>