[DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode


RocMarshal
Hi all,

I would like to propose a new submission mode: build the job directly in the Environment, and then submit it in Yarn mode. Just as with RemoteStreamEnvironment, once you specify the parameters of the Yarn cluster (host, port), or the Yarn configuration directory and HADOOP_USER_NAME, you can submit the job using the topology built by the Env.
Ideally, this submission method should minimize the transfer of the resources that Yarn requires to start flink-jobmanager and taskmanagerrunner, so that Flink can deploy the job on the Yarn cluster as quickly as possible.
A simple demo is shown in the picture. The parameter named 'env' contains all of the job's operators, such as sources, maps, etc.

Thank you for your attention.


 


Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode

Jeff Zhang
Hi Roc,

You can try Flink on Zeppelin, where you can submit a Flink job to Yarn
directly without starting a Flink cluster yourself. Here are a few
tutorials:

1) Get started: https://link.medium.com/oppqD6dIg5 <https://t.co/PTouUYYTrv?amp=1>
2) Batch: https://link.medium.com/3qumbwRIg5 <https://t.co/Yo9QAY0Joj?amp=1>
3) Streaming: https://link.medium.com/RBHa2lTIg5 <https://t.co/sUapN40tvI?amp=1>
4) Advanced usage: https://link.medium.com/CAekyoXIg5 <https://t.co/MXolULmafZ?amp=1>



Roc Marshal <[hidden email]> wrote on Sat, May 2, 2020 at 11:18 AM:

> Hi all,
>
> I would like to propose a new submission mode: build the job directly in
> the Environment, and then submit it in Yarn mode. Just as with
> RemoteStreamEnvironment, once you specify the parameters of the Yarn
> cluster (host, port), or the Yarn configuration directory and
> HADOOP_USER_NAME, you can submit the job using the topology built by the Env.
> Ideally, this submission method should minimize the transfer of the
> resources that Yarn requires to start flink-jobmanager and taskmanagerrunner,
> so that Flink can deploy the job on the Yarn cluster as quickly as
> possible.
> A simple demo is shown in the picture. The parameter named 'env'
> contains all of the job's operators, such as sources, maps, etc.
>
> Thank you for your attention.


--
Best Regards

Jeff Zhang

Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode

Aljoscha Krettek-2
Hi,

image attachments don't work on this ML. You will have to upload the
image somewhere and post a link.

Best,
Aljoscha

On 02.05.20 09:16, Jeff Zhang wrote:

> Hi Roc,
>
> You can try Flink on Zeppelin, where you can submit a Flink job to Yarn
> directly without starting a Flink cluster yourself. Here are a few
> tutorials:
>
> 1) Get started: https://link.medium.com/oppqD6dIg5 <https://t.co/PTouUYYTrv?amp=1>
> 2) Batch: https://link.medium.com/3qumbwRIg5 <https://t.co/Yo9QAY0Joj?amp=1>
> 3) Streaming: https://link.medium.com/RBHa2lTIg5 <https://t.co/sUapN40tvI?amp=1>
> 4) Advanced usage: https://link.medium.com/CAekyoXIg5 <https://t.co/MXolULmafZ?amp=1>

Re:Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode

RocMarshal
Hi Aljoscha,

I have updated the JIRA according to your suggestion. Thank you very much.

Best,
Roc

At 2020-05-05 16:04:01, "Aljoscha Krettek" <[hidden email]> wrote:

>Hi,
>
>image attachments don't work on this ML. You will have to upload the
>image somewhere and post a link.
>
>Best,
>Aljoscha

Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode

Aljoscha Krettek-2
Could you post the Jira issue here? I don't see it mentioned in this
thread so far.

Best,
Aljoscha

On 05.05.20 12:32, Roc Marshal wrote:

> Hi Aljoscha,
>
> I have updated the JIRA according to your suggestion. Thank you very much.
>
> Best,
> Roc

Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode

RocMarshal
Sorry, I confused JIRA with email.
Attachment link: https://gitee.com/RocMarshal/resources4link/blob/master/README.md
JIRA ID: FLINK-17472
JIRA link: https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-17472

Best,
Roc

On 05/05/2020 18:43, Aljoscha Krettek wrote:
> Could you post the Jira issue here? I don't see it mentioned in this
> thread so far.
>
> Best,
> Aljoscha


Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode

Yang Wang
Hi Roc Marshal,

I have a question about making the Yarn deployment as fast as possible. In
my opinion, using the "ExecutionEnvironment" instead of
"flink run -m yarn-cluster" to deploy a Flink cluster on Yarn does not help
reduce the time cost, since we still need to ship the user jars and Flink
libs to the HDFS staging directory and register them as Yarn local
resources.

If we want to achieve this, we need to use pre-uploaded libs to avoid the
unnecessary uploading and downloading. We already have a ticket for
this [1].


[1] https://issues.apache.org/jira/browse/FLINK-13938
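
To make the pre-uploaded-libs idea concrete, here is a minimal sketch in plain Java that only assembles the command line for such a submission. It assumes a later Flink release (1.11 or newer), where a `yarn.provided.lib.dirs` option and the `flink run-application -t yarn-application` command are available. The class name `YarnSubmitCommand`, the helper `build`, and both HDFS paths are hypothetical placeholders, not something taken from this thread or from the ticket.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class YarnSubmitCommand {

    /** Builds the argument list for a Flink application-mode submission on Yarn. */
    static List<String> build(String providedLibDir, String userJar) {
        // Dynamic properties are passed as -Dkey=value; LinkedHashMap keeps insertion order.
        Map<String, String> props = new LinkedHashMap<>();
        // Assumption: option name as exposed by Flink 1.11+ for pre-uploaded libs.
        props.put("yarn.provided.lib.dirs", providedLibDir);

        List<String> cmd = new ArrayList<>();
        cmd.add("flink");
        cmd.add("run-application");
        cmd.add("-t");
        cmd.add("yarn-application");
        for (Map.Entry<String, String> e : props.entrySet()) {
            cmd.add("-D" + e.getKey() + "=" + e.getValue());
        }
        // The user jar goes last; here it also lives on HDFS, so nothing is uploaded from the client.
        cmd.add(userJar);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = build(
                "hdfs:///flink/dist/lib",          // placeholder: directory holding the Flink libs
                "hdfs:///flink/jobs/my-job.jar");  // placeholder: pre-uploaded user jar
        System.out.println(String.join(" ", cmd));
    }
}
```

The sketch does not contact a cluster. It only shows which pieces would have to point at already-uploaded files for the client to skip shipping the Flink libs on every submission.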


Best,
Yang


Roc Marshal <[hidden email]> wrote on Tue, May 5, 2020 at 7:26 PM:

> Sorry, I confused JIRA with email.
> Attachment link:
> https://gitee.com/RocMarshal/resources4link/blob/master/README.md
> JIRA ID: FLINK-17472
> JIRA link:
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-17472
>
> Best,
> Roc