Flip 6 mesos support

Re: Flip 6 mesos support

Till Rohrmann
The resources consumed by the JobMaster can be specified by
`jobmanager.heap.mb`.

Cheers,
Till

On Wed, Mar 21, 2018 at 3:20 PM, Renjie Liu <[hidden email]> wrote:

> Hi, Till:
>
> In fact, I want to ask about the resources consumed by the job manager.
>
> On Wed, Mar 21, 2018 at 8:17 PM, Till Rohrmann <[hidden email]> wrote:
>
> > As many as the application needs to run. If you start a job with
> > parallelism 10 then it will ask for 10 slots (assuming slot sharing).
> >
> > On Wed, Mar 21, 2018 at 12:04 PM, Renjie Liu <[hidden email]>
> > wrote:
> >
> > > So how many slots may a job manager consume?
> > >
> > > On Wed, Mar 21, 2018 at 6:50 PM Till Rohrmann <[hidden email]>
> > > wrote:
> > >
> > > > At the moment this is not possible. In order to do this, you will have to
> > > > use the per-job mode and run each job on a dedicated Flink cluster.
> > > >
> > > > On Wed, Mar 21, 2018 at 11:33 AM, Renjie Liu <[hidden email]> wrote:
> > > >
> > > > > For example, we have 2 jobs.
> > > > > For job 1, I want to start the job manager with 1 CPU and 100M memory. Job 1
> > > > > needs 10 slots, and I want to deploy these 10 slots in 2 task managers,
> > > > > each with 5 cores and 1G memory.
> > > > >
> > > > > For job 2, I want to start the job manager with 2 CPUs and 200M memory. Job 2
> > > > > needs 100 slots, and I want to deploy these 100 slots in 10 task managers,
> > > > > each with 10 cores and 2G memory.
> > > > >
> > > > > Is this possible?
> > > > >
> > > > > On Wed, Mar 21, 2018 at 6:19 PM Till Rohrmann <[hidden email]> wrote:
> > > > >
> > > > > > Hi Renjie,
> > > > > >
> > > > > > what exactly do you mean by specifying different JM and TM resources for
> > > > > > different jobs?
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Wed, Mar 21, 2018 at 10:55 AM, Renjie Liu <[hidden email]> wrote:
> > > > > >
> > > > > > > Hi, Till:
> > > > > > >
> > > > > > > How can I specify job manager and task manager resources for different jobs
> > > > > > > in session mode?
> > > > > > >
> > > > > > > On Sun, Mar 18, 2018 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
> > > > > > >
> > > > > > > > Hi Shuyi,
> > > > > > > >
> > > > > > > > it is best to look at the other e2e tests in the flink-end-to-end-tests
> > > > > > > > module, for example the Kafka e2e test under
> > > > > > > > flink/flink-end-to-end-tests/test-scripts/test_streaming_kafka010.sh.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Till
> > > > > > > >
> > > > > > > > On Fri, Mar 16, 2018 at 10:20 PM, Shuyi Chen <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > > Hi Till,
> > > > > > > > >
> > > > > > > > > For FLINK-8562, the test is passing now because it's not really
> > > > > > > > > checking the right thing.
> > > > > > > > >
> > > > > > > > > Yes, I can help with the Kerberos integration ticket.
> > > > > > > > >
> > > > > > > > > Is there an example of how the e2e test should be structured and invoked?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > > Shuyi
> > > > > > > > >
> > > > > > > > > On Fri, Mar 16, 2018 at 6:51 AM, Till Rohrmann <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Shuyi,
> > > > > > > > > >
> > > > > > > > > > thanks for working on FLINK-8562. Once this issue is fixed, it will
> > > > > > > > > > automatically be executed on the Flip-6 components. In fact it is already
> > > > > > > > > > being executed on Flip-6.
> > > > > > > > > >
> > > > > > > > > > But what you could help the community with is setting up an automated
> > > > > > > > > > end-to-end test for the Kerberos integration if you want:
> > > > > > > > > > https://issues.apache.org/jira/browse/FLINK-8981.
> > > > > > > > > >
> > > > > > > > > > The Flink community is currently working on automating more and more tests
> > > > > > > > > > in order to facilitate faster releases and improve the test coverage. You
> > > > > > > > > > can find more about this effort here:
> > > > > > > > > > https://issues.apache.org/jira/browse/FLINK-8970.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Till
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 15, 2018 at 8:45 PM, Shuyi Chen <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Till,
> > > > > > > > > > >
> > > > > > > > > > > This is Shuyi :) Thanks a lot. In FLINK-8562, I already sent a PR to
> > > > > > > > > > > resolve the issue; it would be great if you could take a look.
> > > > > > > > > > >
> > > > > > > > > > > Please let me know how I can help test the Kerberos authentication. I
> > > > > > > > > > > am decently familiar with the Kerberos and YARN security part in Flink.
> > > > > > > > > > >
> > > > > > > > > > > As a starting point, I'd suggest adding an integration test similar to
> > > > > > > > > > > YARNSessionFIFOSecuredITCase for flip6.
> > > > > > > > > > >
> > > > > > > > > > > Shuyi
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 15, 2018 at 5:44 AM, Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Renjie,
> > > > > > > > > > > >
> > > > > > > > > > > > thanks for the pointer to the YARNSessionFIFOSecuredITCase. You're right
> > > > > > > > > > > > that we should fix this test. There is FLINK-8562 which seems to address
> > > > > > > > > > > > the problem. Will take a look.
> > > > > > > > > > > >
> > > > > > > > > > > > Additionally, we want to test Kerberos authentication explicitly as part of
> > > > > > > > > > > > the release testing for Flink 1.5. I will shortly send around a mail where
> > > > > > > > > > > > I will lay out the ongoing testing efforts and where more is needed.
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Till
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Mar 15, 2018 at 7:37 AM, Renjie Liu <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the clarification
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Mar 15, 2018 at 2:30 PM 周思华 <[hidden email]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Renjie,
> > > > > > > > > > > > > > if I am not misunderstanding, you just need to start the cluster as
> > > > > > > > > > > > > > before. The dispatcher and resourcemanager are spawned by ClusterEntryPoint
> > > > > > > > > > > > > > (you can have a look at yarn-session.sh & FlinkYarnSessionCli &
> > > > > > > > > > > > > > YarnSessionClusterEntrypoint), and the TMs are either spawned lazily by the
> > > > > > > > > > > > > > ResourceManager (which sets up TMs according to the submitted job) or
> > > > > > > > > > > > > > spawned by the setup script (you can have a look at start-cluster.sh).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best Regards,
> > > > > > > > > > > > > > Sihua Zhou
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sent from NetEase Mail Master
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On 03/15/2018 10:14, Renjie Liu <[hidden email]> wrote:
> > > > > > > > > > > > > > Hi, Till:
> > > > > > > > > > > > > > In fact I'm asking how to deploy other components such as the dispatcher, etc.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Mar 15, 2018 at 12:17 AM, Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Renjie,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > in the current master and release-1.5 branch flip-6 is activated by
> > > > > > > > > > > > > > default. If you want to turn it off you have to add `mode: old` to your
> > > > > > > > > > > > > > flink-conf.yaml. I'm really happy that you want to test it out :-)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > Till
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 14, 2018 at 3:03 PM, Renjie Liu <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Till:
> > > > > > > > > > > > > > Is there any doc on deploying flink in flip6 mode? We want to help
> > > > > > > > > > > > > > test it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 14, 2018 at 7:08 PM, Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Renjie,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > in order to make Mesos work, we only needed to implement a Mesos-specific
> > > > > > > > > > > > > > ResourceManager. Look at MesosResourceManager for more details. As
> > > > > > > > > > > > > > dispatcher, we use the StandaloneDispatcher which is spawned by
> > > > > > > > > > > > > > the MesosSessionClusterEntrypoint.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > Till
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 14, 2018 at 9:32 AM, Renjie Liu <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi all:
> > > > > > > > > > > > > > I'm reading the source code and it seems that flip6 does not support
> > > > > > > > > > > > > > mesos? According to the design, the client sends the job graph to the
> > > > > > > > > > > > > > dispatcher, and the dispatcher spawns the job manager and resource manager
> > > > > > > > > > > > > > for job execution. But I can't find a dispatcher implementation for mesos.
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Liu, Renjie
> > > > > > > > > > > > > > Software Engineer, MVAD
> > > > > > > > > > > --
> > > > > > > > > > > "So you have to trust that the dots will somehow
> connect
> > in
> > > > > your
> > > > > > > > > future."
Re: Flip 6 mesos support

Renjie Liu
Hi, Till:
Has anybody succeeded in deploying Flip-6 mode on Mesos?

I'm testing Flip-6 using the master branch and I just can't run jobs. The
following is my configuration:

jobmanager.rpc.address: qt9ss.prod.mediav.com
jobmanager.rpc.port: 6123
jobmanager.heap.mb: 1024
taskmanager.heap.mb: 1024
taskmanager.numberOfTaskSlots: 5
parallelism.default: 1
web.port: 8081
mesos.master: zk://dk71ss.jx.shbt2.qihoo.net:2191,dk72ss.jx.shbt2.qihoo.net:2191,dk5ss.jx.shbt2.qihoo.net:2191/mesos
mesos.resourcemanager.tasks.container.type: docker
mesos.resourcemanager.tasks.container.image.name: dk1ss.prod.mediav.com:5000/adq/flink:1.6.0-SNAPSHOT
mesos.resourcemanager.framework.user: mediav
mesos.resourcemanager.tasks.cpus: 5
mesos.resourcemanager.tasks.mem: 10240
mesos.resourcemanager.framework.name: Flink
mesos.failover-timeout: 60

From the Mesos side, I can see that when I submit a job, the Flink master
requests a container with 5 cores. But the job submission still fails with the
following error:

org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
Could not allocate all requires slots within timeout of 300000 ms. Slots
required: 1, slots allocated: 0

My job only requires 1 slot, but the job manager keeps reporting that no slots
are available.
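
As a rough sanity check on the numbers in this thread, the slot arithmetic can be sketched in Python. This is only an illustration under stated assumptions: it follows Till's remark that, with slot sharing, a job asks for as many slots as its parallelism, and it assumes every task manager offers `taskmanager.numberOfTaskSlots` slots. The helper names `slots_required` and `task_managers_required` are hypothetical and are not part of Flink.

```python
import math


def slots_required(parallelism: int) -> int:
    """Slots a job asks for, assuming slot sharing (one slot per parallel instance)."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return parallelism


def task_managers_required(parallelism: int, slots_per_tm: int) -> int:
    """Smallest number of task managers whose slots cover the job (hypothetical helper)."""
    if slots_per_tm < 1:
        raise ValueError("slots_per_tm must be at least 1")
    return math.ceil(slots_required(parallelism) / slots_per_tm)


# Configuration from this message: parallelism.default = 1, numberOfTaskSlots = 5
print(task_managers_required(1, 5))     # 1
# The two example jobs quoted earlier in the thread
print(task_managers_required(10, 5))    # 2
print(task_managers_required(100, 10))  # 10
```

With the configuration above (parallelism 1, 5 slots per task manager), a single task manager would cover the job, so by this arithmetic alone the timeout is not explained by too few slots being configured.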

--
Liu, Renjie
Software Engineer, MVAD
Re: Flip 6 mesos support

Till Rohrmann
Hi Renjie, could you share the logs with us? This sounds like a bug we
should fix.

Cheers,
Till

On Fri, Mar 23, 2018 at 4:42 AM, Renjie Liu <[hidden email]> wrote:

> Hi, Till:
> Has anybody succeeded to deploy flip 6 mode on mesos?
>
> I'm testing flip 6 using the master branch and I just can't run jobs. The
> following are my configurations:
>
> *jobmanager.rpc.address: qt9ss.prod.mediav.com
> <http://qt9ss.prod.mediav.com>*
> *jobmanager.rpc.port: 6123*
> *jobmanager.heap.mb: 1024*
> *taskmanager.heap.mb: 1024*
> *taskmanager.numberOfTaskSlots: 5*
> *parallelism.default: 1*
> *web.port: 8081*
> *mesos.master: zk://dk71ss.jx.shbt2.qihoo.net:2191
> <http://dk71ss.jx.shbt2.qihoo.net:2191>,dk72ss.jx.shbt2.qihoo.net:2191
> <http://dk72ss.jx.shbt2.qihoo.net:2191>,dk5ss.jx.shbt2.
> qihoo.net:2191/mesos
> <http://dk5ss.jx.shbt2.qihoo.net:2191/mesos>*
> *mesos.resourcemanager.tasks.container.type: docker*
> *mesos.resourcemanager.tasks.container.image.name
> <http://mesos.resourcemanager.tasks.container.image.name>:
> dk1ss.prod.mediav.com:5000/adq/flink:1.6.0-SNAPSHOT
> <http://dk1ss.prod.mediav.com:5000/adq/flink:1.6.0-SNAPSHOT>*
> *mesos.resourcemanager.framework.user: mediav*
> *mesos.resourcemanager.tasks.cpus: 5*
> *mesos.resourcemanager.tasks.mem: 10240*
> *mesos.resourcemanager.framework.name
> <http://mesos.resourcemanager.framework.name>: Flink*
> *mesos.failover-timeout: 60*
>
> From the mesos side, I can see that when I submit a job, flink master will
> request a contianer with 5 cores. But the job submission still fails the
> following error:
> *org.apache.flink.runtime.jobmanager.scheduler.
> NoResourceAvailableException:
> Could not allocate all requires slots within timeout of 300000 ms. Slots
> required: 1, slots allocated: 0*
>
> My job only requires 1 slot but job manager keeps reporting that no slots
> avaiable.
>
> On Wed, Mar 21, 2018 at 10:42 PM Till Rohrmann <[hidden email]>
> wrote:
>
> > The resources consumed by the JobMaster can be specified by
> > `jobmanager.heap.mb`.
> >
> > Cheers,
> > Till
> >
> > On Wed, Mar 21, 2018 at 3:20 PM, Renjie Liu <[hidden email]>
> > wrote:
> >
> > > Hi, Till:
> > >
> > > In fact, I want to ask the resources consume by job manager
> > >
> > > Till Rohrmann <[hidden email]> 于 2018年3月21日周三 下午8:17写道:
> > >
> > > > As many as the application needs to run. If you start a job with
> > > > parallelism 10 then it will ask for 10 slots (assuming slot sharing).
> > > >
> > > > On Wed, Mar 21, 2018 at 12:04 PM, Renjie Liu <
> [hidden email]>
> > > > wrote:
> > > >
> > > > > So how many slots a job manager may consume?
> > > > >
> > > > > On Wed, Mar 21, 2018 at 6:50 PM Till Rohrmann <
> [hidden email]>
> > > > > wrote:
> > > > >
> > > > > > At the moment this is not possible. In order to do this, you will
> > > have
> > > > to
> > > > > > use the per-job mode and run each job on a dedicated Flink
> cluster.
> > > > > >
> > > > > > On Wed, Mar 21, 2018 at 11:33 AM, Renjie Liu <[hidden email]>
> > > > > > wrote:
> > > > > >
> > > > > > > For example, we have 2 jobs.
> > > > > > > For job 1, I want to start the job manager with 1 CPU and 100M
> > > > > > > memory. Job 1 needs 10 slots, and I want to deploy these 10
> > > > > > > slots in 2 task managers, each with 5 cores and 1G memory.
> > > > > > >
> > > > > > > For job 2, I want to start the job manager with 2 CPUs and 200M
> > > > > > > memory. Job 2 needs 100 slots, and I want to deploy these 100
> > > > > > > slots in 10 task managers, each with 10 cores and 2G memory.
> > > > > > >
> > > > > > > Is this possible?
> > > > > > >
> > > > > > > On Wed, Mar 21, 2018 at 6:19 PM Till Rohrmann <[hidden email]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Renjie,
> > > > > > > >
> > > > > > > > what exactly do you mean by specifying different JM and TM
> > > > > > > > resources for different jobs?
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Till
> > > > > > > >
> > > > > > > > On Wed, Mar 21, 2018 at 10:55 AM, Renjie Liu <[hidden email]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Till:
> > > > > > > > >
> > > > > > > > > How can I specify job manager and task manager resources
> > > > > > > > > for different jobs in session mode?
> > > > > > > > >
> > > > > > > > > On Sun, Mar 18, 2018 at 1:10 AM Till Rohrmann <[hidden email]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Shuyi,
> > > > > > > > > >
> > > > > > > > > > it is best to look at the other e2e tests in the
> > > > > > > > > > flink-end-to-end-tests module, for example the Kafka e2e
> > > > > > > > > > test under
> > > > > > > > > > flink/flink-end-to-end-tests/test-scripts/test_streaming_kafka010.sh.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Till
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 16, 2018 at 10:20 PM, Shuyi Chen <[hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Till,
> > > > > > > > > > >
> > > > > > > > > > > For FLINK-8562, the test is passing now because it's not
> > > > > > > > > > > really checking the right thing.
> > > > > > > > > > >
> > > > > > > > > > > Yes, I can help with the Kerberos integration ticket.
> > > > > > > > > > >
> > > > > > > > > > > Is there an example of how the e2e test should be
> > > > > > > > > > > structured and invoked?
> > > > > > > > > > >
> > > > > > > > > > > Thanks
> > > > > > > > > > > Shuyi
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 16, 2018 at 6:51 AM, Till Rohrmann <[hidden email]>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Shuyi,
> > > > > > > > > > > >
> > > > > > > > > > > > thanks for working on FLINK-8562. Once this issue is
> > > > > > > > > > > > fixed, it will automatically be executed on the Flip-6
> > > > > > > > > > > > components. In fact, it is already being executed on
> > > > > > > > > > > > Flip-6.
> > > > > > > > > > > >
> > > > > > > > > > > > But what you could help the community with is setting
> > > > > > > > > > > > up an automated end-to-end test for the Kerberos
> > > > > > > > > > > > integration if you want:
> > > > > > > > > > > > https://issues.apache.org/jira/browse/FLINK-8981.
> > > > > > > > > > > >
> > > > > > > > > > > > The Flink community is currently working on automating
> > > > > > > > > > > > more and more tests in order to facilitate faster
> > > > > > > > > > > > releases and improve the test coverage. You can find
> > > > > > > > > > > > more about this effort here:
> > > > > > > > > > > > https://issues.apache.org/jira/browse/FLINK-8970.
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Till
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Mar 15, 2018 at 8:45 PM, Shuyi Chen <[hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Till,
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is Shuyi :) Thanks a lot. In FLINK-8562, I
> > > > > > > > > > > > > already sent a PR to resolve the issue; it would be
> > > > > > > > > > > > > great if you could take a look.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Please let me know how I can help test the Kerberos
> > > > > > > > > > > > > authentication. I am decently familiar with the
> > > > > > > > > > > > > Kerberos and YARN security part in Flink.
> > > > > > > > > > > > >
> > > > > > > > > > > > > As a starting point, I'd suggest adding an
> > > > > > > > > > > > > integration test similar to
> > > > > > > > > > > > > YARNSessionFIFOSecuredITCase for flip6.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Shuyi
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Mar 15, 2018 at 5:44 AM, Till Rohrmann <[hidden email]>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Renjie,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > thanks for the pointer to the
> > > > > > > > > > > > > > YARNSessionFIFOSecuredITCase. You're right that we
> > > > > > > > > > > > > > should fix this test. There is FLINK-8562, which
> > > > > > > > > > > > > > seems to address the problem. Will take a look.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Additionally, we want to test Kerberos
> > > > > > > > > > > > > > authentication explicitly as part of the release
> > > > > > > > > > > > > > testing for Flink 1.5. I will shortly send around
> > > > > > > > > > > > > > a mail where I will lay out the ongoing testing
> > > > > > > > > > > > > > efforts and where more is needed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > Till
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Mar 15, 2018 at 7:37 AM, Renjie Liu <[hidden email]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the clarification
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Mar 15, 2018 at 2:30 PM 周思华 <[hidden email]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Renjie,
> > > > > > > > > > > > > > > > if I am not misunderstanding, you just need to
> > > > > > > > > > > > > > > > start the cluster as before. The dispatcher and
> > > > > > > > > > > > > > > > resourcemanager are spawned by the
> > > > > > > > > > > > > > > > ClusterEntryPoint (you can have a look at
> > > > > > > > > > > > > > > > yarn-session.sh, FlinkYarnSessionCli and
> > > > > > > > > > > > > > > > YarnSessionClusterEntrypoint), and the TMs are
> > > > > > > > > > > > > > > > either spawned lazily by the ResourceManager
> > > > > > > > > > > > > > > > (which sets up TMs according to the submitted
> > > > > > > > > > > > > > > > job) or spawned by the setup script (you can
> > > > > > > > > > > > > > > > have a look at start-cluster.sh).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best Regards,
> > > > > > > > > > > > > > > > Sihua Zhou
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Sent from NetEase Mail Master
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On 03/15/2018 10:14, Renjie Liu <[hidden email]> wrote:
> > > > > > > > > > > > > > > > Hi, Till:
> > > > > > > > > > > > > > > > In fact, I'm asking how to deploy the other
> > > > > > > > > > > > > > > > components, such as the dispatcher, etc.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Mar 15, 2018 at 12:17 AM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Renjie,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > in the current master and release-1.5 branch,
> > > > > > > > > > > > > > > > flip-6 is activated by default. If you want to
> > > > > > > > > > > > > > > > turn it off, you have to add `mode: old` to your
> > > > > > > > > > > > > > > > flink-conf.yaml. I'm really happy that you want
> > > > > > > > > > > > > > > > to test it out :-)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Mar 14, 2018 at 3:03 PM, Renjie Liu <[hidden email]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Till:
> > > > > > > > > > > > > > > > Is there any doc on deploying Flink in flip6
> > > > > > > > > > > > > > > > mode? We want to help test it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Mar 14, 2018 at 7:08 PM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Renjie,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > in order to make Mesos work, we only needed to
> > > > > > > > > > > > > > > > implement a Mesos-specific ResourceManager. Look
> > > > > > > > > > > > > > > > at MesosResourceManager for more details. As
> > > > > > > > > > > > > > > > dispatcher, we use the StandaloneDispatcher,
> > > > > > > > > > > > > > > > which is spawned by the
> > > > > > > > > > > > > > > > MesosSessionClusterEntrypoint.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Mar 14, 2018 at 9:32 AM, Renjie Liu <[hidden email]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi all:
> > > > > > > > > > > > > > > > I'm reading the source code and it seems that
> > > > > > > > > > > > > > > > flip6 does not support Mesos?
> > > > > > > > > > > > > > > > According to the design, the client sends the job
> > > > > > > > > > > > > > > > graph to the dispatcher, and the dispatcher
> > > > > > > > > > > > > > > > spawns a job manager and resource manager for
> > > > > > > > > > > > > > > > job execution. But I can't find a dispatcher
> > > > > > > > > > > > > > > > implementation for Mesos.
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > Liu, Renjie
> > > > > > > > > > > > > > > > Software Engineer, MVAD
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > Liu, Renjie
> > > > > > > > > > > > > > > > Software Engineer, MVAD
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > Liu, Renjie
> > > > > > > > > > > > > > > > Software Engineer, MVAD
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > Liu, Renjie
> > > > > > > > > > > > > > > Software Engineer, MVAD
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > "So you have to trust that the dots will somehow
> > > > > > > > > > > > > connect in your future."
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > "So you have to trust that the dots will somehow
> > > > > > > > > > > connect in your future."
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Liu, Renjie
> > > > > > > > > Software Engineer, MVAD
> > > > > > > > >
> > > > > > > >
> > > > > > > --
> > > > > > > Liu, Renjie
> > > > > > > Software Engineer, MVAD
> > > > > > >
> > > > > >
> > > > > --
> > > > > Liu, Renjie
> > > > > Software Engineer, MVAD
> > > > >
> > > >
> > > --
> > > Liu, Renjie
> > > Software Engineer, MVAD
> > >
> >
> --
> Liu, Renjie
> Software Engineer, MVAD
>

Re: Flip 6 mesos support

Renjie Liu
Hi, Till:
Attached is my log.

I'm also looking into this. Could you please assign this bug to me? I'd like to contribute to Flink.

On Fri, Mar 23, 2018 at 4:11 PM Till Rohrmann <[hidden email]> wrote:
Hi Renjie, could you share the logs with us? This sounds like a bug we
should fix.

Cheers,
Till

On Fri, Mar 23, 2018 at 4:42 AM, Renjie Liu <[hidden email]> wrote:

> Hi, Till:
> Has anybody succeeded in deploying flip 6 mode on Mesos?
>
> I'm testing flip 6 using the master branch and I just can't run jobs. The
> following are my configurations:
>
> *jobmanager.rpc.address: qt9ss.prod.mediav.com*
> *jobmanager.rpc.port: 6123*
> *jobmanager.heap.mb: 1024*
> *taskmanager.heap.mb: 1024*
> *taskmanager.numberOfTaskSlots: 5*
> *parallelism.default: 1*
> *web.port: 8081*
> *mesos.master: zk://dk71ss.jx.shbt2.qihoo.net:2191,dk72ss.jx.shbt2.qihoo.net:2191,dk5ss.jx.shbt2.qihoo.net:2191/mesos*
> *mesos.resourcemanager.tasks.container.type: docker*
> *mesos.resourcemanager.tasks.container.image.name: dk1ss.prod.mediav.com:5000/adq/flink:1.6.0-SNAPSHOT*
> *mesos.resourcemanager.framework.user: mediav*
> *mesos.resourcemanager.tasks.cpus: 5*
> *mesos.resourcemanager.tasks.mem: 10240*
> *mesos.resourcemanager.framework.name: Flink*
> *mesos.failover-timeout: 60*
>
> From the Mesos side, I can see that when I submit a job, the Flink master
> requests a container with 5 cores. But the job submission still fails with
> the following error:
> *org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
> Could not allocate all requires slots within timeout of 300000 ms. Slots
> required: 1, slots allocated: 0*
>
> My job only requires 1 slot, but the job manager keeps reporting that no
> slots are available.
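
For readers following the slot discussion in this thread, the arithmetic can be sketched as follows. This is only an illustration, not Flink code: the function name `task_managers_needed` is hypothetical, and it assumes default slot sharing, so that a job needs as many slots as its highest operator parallelism, with each TaskManager offering `taskmanager.numberOfTaskSlots` slots.

```python
import math


def task_managers_needed(max_parallelism: int, slots_per_tm: int) -> int:
    """Return how many TaskManagers are needed to supply a job's slots.

    Assumption (not taken from Flink's code): with default slot sharing
    the job needs `max_parallelism` slots, and every TaskManager offers
    `slots_per_tm` slots (taskmanager.numberOfTaskSlots).
    """
    if max_parallelism < 1 or slots_per_tm < 1:
        raise ValueError("both arguments must be positive")
    # Round up: a partially used TaskManager still has to be started.
    return math.ceil(max_parallelism / slots_per_tm)


# Configuration quoted in this thread: parallelism 1, 5 slots per TaskManager.
print(task_managers_needed(1, 5))     # 1
# The two example jobs described earlier in the thread.
print(task_managers_needed(10, 5))    # 2
print(task_managers_needed(100, 10))  # 10
```

With the configuration quoted above, a single TaskManager would be enough for the job, which is consistent with the one container request seen on the Mesos side. The sketch says nothing about why the slot was never allocated.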
--
Liu, Renjie
Software Engineer, MVAD

Re: Flip 6 mesos support

Ted Yu
Renjie: The log didn't go through.
Consider filing a JIRA and attaching the log there.
Thanks

-------- Original message --------
From: Renjie Liu <[hidden email]>
Date: 3/23/18 1:38 AM (GMT-08:00)
To: [hidden email]
Subject: Re: Flip 6 mesos support

Hi, Till:
Attached is my log.

I'm also looking into this. Could you please assign this bug to me? I'd like to contribute to Flink.

On Fri, Mar 23, 2018 at 4:11 PM Till Rohrmann <[hidden email]> wrote:
Hi Renjie, could you share the logs with us? This sounds like a bug we
should fix.

Cheers,
Till

On Fri, Mar 23, 2018 at 4:42 AM, Renjie Liu <[hidden email]> wrote:

> Hi, Till:
> Has anybody succeeded in deploying flip 6 mode on mesos?
>
> I'm testing flip 6 using the master branch and I just can't run jobs. The
> following are my configurations:
>
> jobmanager.rpc.address: qt9ss.prod.mediav.com
> jobmanager.rpc.port: 6123
> jobmanager.heap.mb: 1024
> taskmanager.heap.mb: 1024
> taskmanager.numberOfTaskSlots: 5
> parallelism.default: 1
> web.port: 8081
> mesos.master: zk://dk71ss.jx.shbt2.qihoo.net:2191,dk72ss.jx.shbt2.qihoo.net:2191,dk5ss.jx.shbt2.qihoo.net:2191/mesos
> mesos.resourcemanager.tasks.container.type: docker
> mesos.resourcemanager.tasks.container.image.name: dk1ss.prod.mediav.com:5000/adq/flink:1.6.0-SNAPSHOT
> mesos.resourcemanager.framework.user: mediav
> mesos.resourcemanager.tasks.cpus: 5
> mesos.resourcemanager.tasks.mem: 10240
> mesos.resourcemanager.framework.name: Flink
> mesos.failover-timeout: 60
>
> From the mesos side, I can see that when I submit a job, the flink master
> requests a container with 5 cores. But the job submission still fails with
> the following error:
>
> org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
> Could not allocate all requires slots within timeout of 300000 ms. Slots
> required: 1, slots allocated: 0
>
> My job only requires 1 slot but the job manager keeps reporting that no
> slots are available.
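
The slot arithmetic behind this report can be sketched roughly as follows. This is only an illustration and not Flink code: `task_managers_needed` is a hypothetical helper, and it assumes slot sharing, so that a job needs as many slots as its parallelism (as described earlier in the thread). Under that assumption, the configuration above should need only one TaskManager container for a job with parallelism 1, which is consistent with the single 5-core container seen on the Mesos side.

```python
import math


def task_managers_needed(parallelism: int, slots_per_tm: int) -> int:
    """Hypothetical helper: TaskManagers needed to supply `parallelism` slots.

    Assumes slot sharing, i.e. the job asks for as many slots as its
    parallelism. Not part of any Flink API.
    """
    if parallelism < 1 or slots_per_tm < 1:
        raise ValueError("parallelism and slots_per_tm must be positive")
    return math.ceil(parallelism / slots_per_tm)


# Values copied from the configuration quoted in the message above.
config = {
    "taskmanager.numberOfTaskSlots": 5,
    "parallelism.default": 1,
    "mesos.resourcemanager.tasks.cpus": 5,
    "mesos.resourcemanager.tasks.mem": 10240,
}

tms = task_managers_needed(
    config["parallelism.default"],
    config["taskmanager.numberOfTaskSlots"],
)
print(f"TaskManagers needed: {tms}")
print(f"CPUs requested from Mesos: {tms * config['mesos.resourcemanager.tasks.cpus']}")
print(f"Memory requested from Mesos (MB): {tms * config['mesos.resourcemanager.tasks.mem']}")
```

The same helper reproduces the two examples from earlier in the thread: 10 slots at 5 slots per TaskManager give 2 TaskManagers, and 100 slots at 10 slots per TaskManager give 10.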


Re: Flip 6 mesos support

Renjie Liu
Hi:
Could you please help check whether there is any mistake in the config? If
not, I'll file a bug in JIRA.

On Fri, Mar 23, 2018 at 7:16 PM, Ted Yu <[hidden email]> wrote:

> Renjie: The log didn't go through.
> Consider filing a JIRA and attaching the log there.
> Thanks
>
> -------- Original message --------
> From: Renjie Liu <[hidden email]>
> Date: 3/23/18 1:38 AM (GMT-08:00)
> To: [hidden email]
> Subject: Re: Flip 6 mesos support
>
> Hi, Till: Attached is my log.
> I'm also looking into this. Could you please assign this bug to me? I'm
> also trying to contribute to Flink.

--
Liu, Renjie
Software Engineer, MVAD

Re: Flip 6 mesos support

Till Rohrmann
Hi Renjie,

we couldn't take a look at your configuration yet because the mailing list
filters out attachments. You could upload your log to https://gist.github.com/
or open a JIRA and attach the log there.

Cheers,
Till

On Sun, Mar 25, 2018 at 9:04 AM, Renjie Liu <[hidden email]> wrote:

> Hi:
> Could you please help check whether there is any mistake in the config? If
> not, I'll file a bug in JIRA.
>
> On Fri, Mar 23, 2018 at 7:16 PM, Ted Yu <[hidden email]> wrote:
>
> > Renjie: The log didn't go through.
> > Consider filing a JIRA and attaching the log there.
> > Thanks
> >
> > -------- Original message --------
> > From: Renjie Liu <[hidden email]>
> > Date: 3/23/18 1:38 AM (GMT-08:00)
> > To: [hidden email]
> > Subject: Re: Flip 6 mesos support
> >
> > Hi, Till: Attached is my log.
> > I'm also looking into this. Could you please assign this bug to me? I'm
> > also trying to contribute to Flink.
>
> --
> Liu, Renjie
> Software Engineer, MVAD
>

Re: Flip 6 mesos support

Renjie Liu
Hi, Till:
Issue opened here: https://issues.apache.org/jira/browse/FLINK-9077

On Tue, Mar 27, 2018 at 11:31 PM Till Rohrmann <[hidden email]> wrote:

> Hi Renjie,
>
> we couldn't take a look at your configuration yet, because the ML filters
> attachments out. You could upload your log to https://gist.github.com/ or
> open a JIRA to which you attach the log.
>
> Cheers,
> Till
>
> On Sun, Mar 25, 2018 at 9:04 AM, Renjie Liu <[hidden email]>
> wrote:
>
> > Hi:
> > > Could you please help check whether there is any mistake in the config? If
> > > not, I'll file a bug in JIRA.
> >
> > > On Fri, Mar 23, 2018 at 7:16 PM, Ted Yu <[hidden email]> wrote:
> >
> > > > Renjie: The log didn't go through.
> > > > Consider logging a JIRA and attaching the log there.
> > > > Thanks
> > > >
> > > > -------- Original message --------
> > > > From: Renjie Liu <[hidden email]>
> > > > Date: 3/23/18 1:38 AM (GMT-08:00)
> > > > To: [hidden email]
> > > > Subject: Re: Flip 6 mesos support
> > > >
> > > > Hi, Till:
> > > > Attached is my log.
> > > > I'm also looking into this, could you please assign this bug to me? I'm
> > > > also trying to contribute to flink.
> > > >
> > > > On Fri, Mar 23, 2018 at 4:11 PM Till Rohrmann <[hidden email]>
> > > > wrote:
> > > >
> > > > Hi Renjie, could you share the logs with us? This sounds like a bug we
> > > > should fix.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Fri, Mar 23, 2018 at 4:42 AM, Renjie Liu <[hidden email]>
> > > > wrote:
> > > >
> > > > > Hi, Till:
> > > > >
> > > > > Has anybody succeeded in deploying flip 6 mode on mesos?
> > > > >
> > > > > I'm testing flip 6 using the master branch and I just can't run jobs. The
> > > > > following are my configurations:
> > > > >
> > > > > jobmanager.rpc.address: qt9ss.prod.mediav.com
> > > > > jobmanager.rpc.port: 6123
> > > > > jobmanager.heap.mb: 1024
> > > > > taskmanager.heap.mb: 1024
> > > > > taskmanager.numberOfTaskSlots: 5
> > > > > parallelism.default: 1
> > > > > web.port: 8081
> > > > > mesos.master: zk://dk71ss.jx.shbt2.qihoo.net:2191,dk72ss.jx.shbt2.qihoo.net:2191,dk5ss.jx.shbt2.qihoo.net:2191/mesos
> > > > > mesos.resourcemanager.tasks.container.type: docker
> > > > > mesos.resourcemanager.tasks.container.image.name: dk1ss.prod.mediav.com:5000/adq/flink:1.6.0-SNAPSHOT
> > > > > mesos.resourcemanager.framework.user: mediav
> > > > > mesos.resourcemanager.tasks.cpus: 5
> > > > > mesos.resourcemanager.tasks.mem: 10240
> > > > > mesos.resourcemanager.framework.name: Flink
> > > > > mesos.failover-timeout: 60
> > > > >
> > > > > From the mesos side, I can see that when I submit a job, flink master will
> > > > > request a container with 5 cores. But the job submission still fails with the
> > > > > following error:
> > > > >
> > > > > org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
> > > > > Could not allocate all requires slots within timeout of 300000 ms. Slots
> > > > > required: 1, slots allocated: 0
> > > > >
> > > > > My job only requires 1 slot but job manager keeps reporting that no slots
> > > > > are available.
> > > > >
> >
> > --
> > Liu, Renjie
> > Software Engineer, MVAD
> >
>
--
Liu, Renjie
Software Engineer, MVAD
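
For reference, the slot arithmetic discussed earlier in this thread (a job with parallelism N asks for N slots, assuming slot sharing) can be sketched as below. This is only an illustration in Python, not Flink code: the helper name task_managers_needed is hypothetical, and the numbers are the ones quoted in the thread (taskmanager.numberOfTaskSlots: 5, parallelism.default: 1, and the two example jobs).

```python
import math


def task_managers_needed(parallelism: int, slots_per_tm: int) -> int:
    """Hypothetical helper: how many TaskManagers are needed to host
    `parallelism` slots, assuming slot sharing (one slot per parallel
    pipeline) and `slots_per_tm` slots on every TaskManager."""
    if parallelism < 1 or slots_per_tm < 1:
        raise ValueError("parallelism and slots_per_tm must be positive")
    return math.ceil(parallelism / slots_per_tm)


# Reported setup: parallelism.default 1, taskmanager.numberOfTaskSlots 5
print(task_managers_needed(1, 5))     # 1
# "Job 1" example: 10 slots, 5 slots per task manager
print(task_managers_needed(10, 5))    # 2
# "Job 2" example: 100 slots, 10 slots per task manager
print(task_managers_needed(100, 10))  # 10
```

Under these assumptions the reported job should fit into a single task manager, which is consistent with treating the "slots allocated: 0" error as a bug rather than a sizing problem.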