(DEPRECATED) Apache Flink Mailing List archive.

[DISCUSS] FLIP-108: Add GPU support in Flink

Stephan Ewen

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

This sounds good to go ahead from my side.

I like the approach that Becket suggested - in that case the core
abstraction that everyone would need to understand would be "external
resource allocation" and the "ResourceInfoProvider", and the GPU specific
code would be a specific implementation only known to that component that
allocates the external resource. That fits the separation of concerns well.

I also understand that it should not be over-engineered in the first
version, so some simplification makes sense, and then gradually expand from
there.

So +1 to go ahead with what was suggested above (Xintong / Becket) from my
side.
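
For illustration only, here is a minimal Java sketch of the separation described above. It follows the rough shape of the interface Becket proposed further down the thread, simplified so that it compiles on its own: `String` stands in for `AbstractID` / `OperatorId`, a plain `long amount` stands in for `ResourceProfile`, and `GPUDeviceInfo`, `GPUInfoProvider` and the `gpu-<index>` key format are hypothetical names, not Flink classes.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the interface proposed in the thread (assumption).
interface ResourceInfoProvider<INFO> {
    // Returns one INFO per allocated unit, keyed by a resource ID.
    Map<String, INFO> retrieveResourceInfo(String operatorId, long amount);
}

// Hypothetical info object that would be handed to operators.
final class GPUDeviceInfo {
    final int index;

    GPUDeviceInfo(int index) {
        this.index = index;
    }
}

// GPU-specific implementation; the core only sees ResourceInfoProvider.
final class GPUInfoProvider implements ResourceInfoProvider<GPUDeviceInfo> {
    private final Deque<Integer> free = new ArrayDeque<>();

    GPUInfoProvider(List<Integer> discoveredIndexes) {
        free.addAll(discoveredIndexes);
    }

    @Override
    public Map<String, GPUDeviceInfo> retrieveResourceInfo(String operatorId, long amount) {
        if (amount > free.size()) {
            throw new IllegalStateException("Not enough free GPUs for " + operatorId);
        }
        Map<String, GPUDeviceInfo> result = new LinkedHashMap<>();
        for (long i = 0; i < amount; i++) {
            int index = free.poll();
            result.put("gpu-" + index, new GPUDeviceInfo(index));
        }
        return result;
    }
}

class ResourceInfoProviderSketch {
    public static void main(String[] args) {
        ResourceInfoProvider<GPUDeviceInfo> provider =
                new GPUInfoProvider(Arrays.asList(0, 1, 2));
        System.out.println(provider.retrieveResourceInfo("op-A", 2).keySet());
    }
}
```

In this sketch, nothing outside `GPUInfoProvider` knows that the resource is a GPU, which is the point of the abstraction.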

On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:

> Thanks for the comments, Stephan & Becket.
>
> @Stephan
>
> I see your concern, and I completely agree with you that we should first
> think about the "library" / "plugin" / "extension" style if possible.
>
> > If GPUs are sliced and assigned during scheduling, there may be reason,
> > although it looks that it would belong to the slot then. Is that what we
> > are doing here?
>
>
> In the current proposal, we do not have the GPUs sliced and assigned to
> slots, because it could be problematic without dynamic slot allocation.
> E.g., the number of GPUs might not be evenly divisible by the number of
> slots.
>
> I think it makes sense to eventually have the GPUs assigned to slots. Even
> then, we might still need a TM level GPUManager (or ResourceProvider like
> Becket suggested). For memory, in each slot we can simply request the
> amount of memory, leaving it to JVM / OS to decide which memory (address)
> should be assigned. For GPU, and potentially other resources like FPGA, we
> need to explicitly specify which GPU (index) should be used. Therefore, we
> need some component at the TM level to coordinate which slot uses which
> GPU.
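
The TM-level coordination described above could be sketched as follows. This is a hypothetical illustration, not code from the FLIP: `SlotGpuCoordinator`, its methods and the string slot IDs are all assumed names. It only shows the idea that a slot asks for an amount and the coordinator decides which concrete GPU indexes it gets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical TM-level coordinator: tracks which slot uses which GPU index.
final class SlotGpuCoordinator {
    private final TreeSet<Integer> freeIndexes = new TreeSet<>();
    private final Map<String, List<Integer>> assigned = new HashMap<>();

    SlotGpuCoordinator(int gpuCount) {
        for (int i = 0; i < gpuCount; i++) {
            freeIndexes.add(i);
        }
    }

    // Reserves `amount` concrete GPU indexes for a slot, lowest index first.
    synchronized List<Integer> allocate(String slotId, int amount) {
        if (amount > freeIndexes.size()) {
            throw new IllegalStateException("Only " + freeIndexes.size() + " GPUs free");
        }
        List<Integer> picked = new ArrayList<>();
        for (int i = 0; i < amount; i++) {
            picked.add(freeIndexes.pollFirst());
        }
        assigned.computeIfAbsent(slotId, k -> new ArrayList<>()).addAll(picked);
        return picked;
    }

    // Returns all GPUs held by a slot to the free pool.
    synchronized void release(String slotId) {
        List<Integer> held = assigned.remove(slotId);
        if (held != null) {
            freeIndexes.addAll(held);
        }
    }
}

class SlotGpuCoordinatorSketch {
    public static void main(String[] args) {
        SlotGpuCoordinator coordinator = new SlotGpuCoordinator(3);
        System.out.println(coordinator.allocate("slot-0", 2));
        System.out.println(coordinator.allocate("slot-1", 1));
        coordinator.release("slot-0");
        System.out.println(coordinator.allocate("slot-2", 2));
    }
}
```

Because indexes are handed out on request rather than pre-divided among slots, three GPUs shared by two slots is not a problem in this sketch.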
>
> IMO, unless we say Flink will not support slot-level GPU slicing at least
> in the foreseeable future, I don't see a good way to avoid touching the TM
> core. To that end, I think Becket's suggestion points to a good direction,
> that supports more features (GPU, FPGA, etc.) with less coupling to the TM
> core (only needs to understand the general interfaces). The detailed
> implementation for specific resource types can even be encapsulated as a
> library.
>
> @Becket
>
> Thanks for sharing your thoughts on the final state. Setting aside the
> details of how the interfaces should look, I think this is a really good
> abstraction for supporting general resource types.
>
> I'd like to further clarify that, the following three things are all that
> the "Flink core" needs to understand.
>
> - The *amount* of resource, for scheduling. Actually, we already have
> the Resource class in ResourceProfile and ResourceSpec for extended
> resource. It's just not really used.
> - The *info*, that Flink provides to the operators / user codes.
> - The *provider*, which generates the info based on the amount.
>
> The "core" does not need to understand the specific implementation details
> of the above three. They can even be implemented in a 3rd-party library.
> Similar to how we allow users to define their custom MetricReporter.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
>
> > Thanks for the comment, Stephan.
> >
> > > - If everything becomes a "core feature", it will make the project hard
> > > to develop in the future. Thinking "library" / "plugin" / "extension"
> > > style where possible helps.
> >
> >
> > Completely agree. It is much more important to design a mechanism than
> > focusing on a specific case. Here is what I am thinking to fully support
> > custom resource management:
> > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to define the
> > resource and the amount required. They will be used to find suitable TM
> > slots to run the tasks. At this point, the resources are only measured by
> > amount, i.e. they do not have individual IDs.
> >
> > 2. On the TM side, have something like *"ResourceInfoProvider"* to identify
> > and provide the detailed information of the individual resource, e.g. GPU
> > ID. It is important because the operator may have to explicitly interact
> > with the physical resource it uses. The ResourceInfoProvider might look
> > something like the one below.
> > interface ResourceInfoProvider<INFO> {
> >     Map<AbstractID, INFO> retrieveResourceInfo(
> >         OperatorId opId, ResourceProfile resourceProfile);
> > }
> >
> > - There could be several "*ResourceInfoProvider*" configured on the TM to
> > retrieve the information for different resources.
> > - The TM will be responsible for assigning those individual resources to
> > each operator according to their requested amount.
> > - The operators will be able to get the ResourceInfo from their
> > RuntimeContext.
> >
> > If we agree this is a reasonable final state, we can adapt the current FLIP
> > to it. In fact, it does not sound like a big change to me. All the proposed
> > configuration can stay as is; it is just that Flink itself won't care about
> > them. Instead, a GPUInfoProvider implementing the ResourceInfoProvider will
> > use them.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]> wrote:
> >
> > > Hi all!
> > >
> > > The main point I wanted to throw into the discussion is the following:
> > > - With more and more use cases, more and more tools go into Flink
> > > - If everything becomes a "core feature", it will make the project hard
> > > to develop in the future. Thinking "library" / "plugin" / "extension"
> > > style where possible helps.
> > >
> > > - A good thought experiment is always: How many future developers have
> > > to interact with this code (and possibly understand it partially), even
> > > if the features they touch have nothing to do with GPU support? If many
> > > contributors to unrelated features will have to touch it and understand
> > > it, then let's think if there is a different solution. Maybe there is
> > > not, but then we should be sure why.
> > >
> > > - That led me to raising this issue: If the GPU manager becomes a core
> > > service in the TaskManager, Environment, RuntimeContext, etc. then
> > > everyone developing TM and streaming tasks needs to understand the GPU
> > > manager. That seems oddly specific, is my impression.
> > >
> > > Access to configuration seems not the right reason to do that. We should
> > > expose the Flink configuration from the RuntimeContext anyways.
> > >
> > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > although it looks that it would belong to the slot then. Is that what we
> > > are doing here?
> > >
> > > Best,
> > > Stephan
> > >
> > >
> > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <[hidden email]>
> > > wrote:
> > >
> > > > Thanks for the feedback, Becket.
> > > >
> > > > IMO, eventually an operator should only see info of GPUs that are
> > > > dedicated to it, instead of all GPUs on the machine/container as in the
> > > > current design. It does not make sense to make the user who writes a UDF
> > > > worry about coordination among multiple operators running on the same
> > > > machine. And if we want to limit the GPU info an operator sees, we should
> > > > not let the operator instantiate GPUManager, which means we have to
> > > > expose something through the runtime context, either GPU info or some
> > > > kind of limited access to the GPUManager.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <[hidden email]>
> > wrote:
> > > >
> > > > > It probably makes sense for us to first agree on the final state.
> > > > > More specifically, will the resource info be exposed through the
> > > > > runtime context eventually?
> > > > >
> > > > > If that is the final state and we have a seamless migration story
> > > > > from this FLIP to that final state, personally I think it is OK to
> > > > > expose the GPU info in the runtime context.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> [hidden email]
> > >
> > > > > wrote:
> > > > >
> > > > > > @Yangze,
> > > > > > I think what Stephan means (@Stephan, please correct me if I'm
> > wrong)
> > > > is
> > > > > > that, we might not need to hold and maintain the GPUManager as a
> > > > service
> > > > > in
> > > > > > TaskManagerServices or RuntimeContext. An alternative is to
> create
> > /
> > > > > > retrieve the GPUManager only in the operators that need it, e.g.,
> > > with
> > > > a
> > > > > > static method `GPUManager.get()`.
> > > > > >
> > > > > > @Stephan,
> > > > > > I agree with you on excluding GPUManager from
> TaskManagerServices.
> > > > > >
> > > > > > - For the first step, where we provide unified TM-level GPU
> > > > > > information to all operators, it should be fine to have operators
> > > > > > access / lazily initialize GPUManager by themselves.
> > > > > > - In the future, we might have some more fine-grained GPU
> > > > > > management, where we need to maintain GPUManager as a service and
> > > > > > put GPU info in slot profiles. But at least for now it's not
> > > > > > necessary to introduce such complexity.
> > > > > >
> > > > > > However, I have some concerns about excluding GPUManager from
> > > > > > RuntimeContext and letting operators access it directly.
> > > > > >
> > > > > > - Configurations needed for creating the GPUManager are not always
> > > > > > available to operators.
> > > > > > - If later we want to have fine-grained control over GPUs (e.g.,
> > > > > > operators in each slot can only see GPUs reserved for that slot),
> > > > > > the approach cannot be easily extended.
> > > > > >
> > > > > > I would suggest wrapping the GPUManager behind RuntimeContext and
> > > > > > only exposing the GPUInfo to users. For now, we can declare a method
> > > > > > `getGPUInfo()` in RuntimeContext, with a default definition that
> > > > > > calls `GPUManager.get()` to get the lazily-created GPUManager. If
> > > > > > later we want to create / retrieve GPUManager in a different way, we
> > > > > > can simply change how `getGPUInfo` is implemented, without needing to
> > > > > > change any public interfaces.
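
The suggestion above can be sketched roughly as below. This is a standalone illustration under stated assumptions: `RuntimeContextSketch` is a simplified stand-in for Flink's RuntimeContext, the bodies of `GPUInfo` and `GPUManager` are invented for the example, and the fixed index list replaces whatever the real discovery step would return.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical read-only view handed to user code.
final class GPUInfo {
    private final List<Integer> indexes;

    GPUInfo(List<Integer> indexes) {
        this.indexes = Collections.unmodifiableList(indexes);
    }

    List<Integer> getIndexes() {
        return indexes;
    }
}

// Hypothetical manager; a real one would run the configured discovery script.
final class GPUManager {
    private static GPUManager instance;
    private final GPUInfo info;

    private GPUManager() {
        // Assumption: discovery is replaced by a fixed list in this sketch.
        this.info = new GPUInfo(Arrays.asList(0, 1));
    }

    // Lazily creates the process-wide manager on first access.
    static synchronized GPUManager get() {
        if (instance == null) {
            instance = new GPUManager();
        }
        return instance;
    }

    GPUInfo getGPUInfo() {
        return info;
    }
}

// Simplified stand-in for RuntimeContext: users only ever see GPUInfo.
interface RuntimeContextSketch {
    default GPUInfo getGPUInfo() {
        return GPUManager.get().getGPUInfo();
    }
}

class GpuInfoAccessSketch {
    public static void main(String[] args) {
        RuntimeContextSketch ctx = new RuntimeContextSketch() {};
        System.out.println(ctx.getGPUInfo().getIndexes());
    }
}
```

If the manager were later held as a TM service, only the default method body would change in this sketch; callers of `getGPUInfo()` would stay the same.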
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <[hidden email]>
> > > > wrote:
> > > > > >
> > > > > > > @Stephan
> > > > > > > Do you mean Minicluster? Yes, it makes sense to share the GPU
> > > Manager
> > > > > > > in such scenario.
> > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > GPUManager(ExternalResourceManagers) in TaskExecutor instead of
> > > > > > > TaskManagerServices.
> > > > > > >
> > > > > > > Regarding the RuntimeContext/FunctionContext, it just holds the
> > GPU
> > > > > > > info instead of the GPU Manager. AFAIK, it's the only place we
> > > could
> > > > > > > pass GPU info to the RichFunction/UserDefinedFunction.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <
> > > [hidden email]
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000 [hidden email]
> wrote
> > > > ----
> > > > > > > >
> > > > > > > > > > Can we somehow keep this out of the TaskManager services
> > > > > > > > > I fear that we could not. IMO, the GPUManager(or
> > > > > > > > > ExternalServicesManagers in future) is conceptually one of
> > the
> > > > task
> > > > > > > > > manager services, just like MemoryManager before 1.10.
> > > > > > > > > - It maintains/holds the GPU resource at TM level and all
> of
> > > the
> > > > > > > > > operators allocate the GPU resources from it. So, it should
> > be
> > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > - We could add a collection called ExternalResourceManagers
> > to
> > > > hold
> > > > > > > > > all managers of other external resources in the future.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Can you help me understand why this needs the addition in
> > > > > > > > TaskManagerServices or in the RuntimeContext?
> > > > > > > > Are you worried about the case when multiple Task Executors
> run
> > > in
> > > > > the
> > > > > > > same
> > > > > > > > JVM? That's not common, but wouldn't it actually be good in
> > that
> > > > case
> > > > > > to
> > > > > > > > share the GPU Manager, given that the GPU is shared?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Stephan
> > > > > > > >
> > > > > > > > ---------------------------
> > > > > > > >
> > > > > > > >
> > > > > > > > > What parts need information about this?
> > > > > > > > > In this FLIP, operators need the information. Thus, we
> expose
> > > GPU
> > > > > > > > > information to the RuntimeContext/FunctionContext. The slot
> > > > profile
> > > > > > is
> > > > > > > > > not aware of GPU resources as GPU is TM level resource now.
> > > > > > > > >
> > > > > > > > > > Can the GPU Manager be a "self contained" thing that
> simply
> > > > takes
> > > > > > the
> > > > > > > > > configuration, and then abstracts everything internally?
> > > > > > > > > Yes, we just pass the path/args of the discover script and
> > how
> > > > many
> > > > > > > > > GPUs per TM to it. It takes the responsibility to get the
> GPU
> > > > > > > > > information and expose them to the
> > > RuntimeContext/FunctionContext
> > > > > of
> > > > > > > > > Operators. Meanwhile, we'd better not allow operators to
> > > directly
> > > > > > > > > access GPUManager, it should get what they want from
> Context.
> > > We
> > > > > > could
> > > > > > > > > then decouple the interface/implementation of GPUManager
> and
> > > > Public
> > > > > > > > > API.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <
> > [hidden email]
> > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > It sounds fine to initially start with GPU specific
> support
> > > and
> > > > > > think
> > > > > > > > > about
> > > > > > > > > > generalizing this once we better understand the space.
> > > > > > > > > >
> > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > - Can we somehow keep this out of the TaskManager
> services?
> > > > > > Anything
> > > > > > > we
> > > > > > > > > > have to pull through all layers of the TM makes the TM
> > > > components
> > > > > > yet
> > > > > > > > > more
> > > > > > > > > > complex and harder to maintain.
> > > > > > > > > >
> > > > > > > > > > - What parts need information about this?
> > > > > > > > > > -> do the slot profiles need information about the GPU?
> > > > > > > > > > -> Can the GPU Manager be a "self contained" thing that
> > > simply
> > > > > > takes
> > > > > > > > > > the configuration, and then abstracts everything
> > internally?
> > > > > > > Operators
> > > > > > > > > can
> > > > > > > > > > access it via "GPUManager.get()" or so?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <
> > > [hidden email]>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > >
> > > > > > > > > > > @Becket
> > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right, I'll add
> > > them
> > > > to
> > > > > > the
> > > > > > > > > > > Public API section.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > Regarding the general extended resource mechanism, I
> > second
> > > > > > > Xintong's
> > > > > > > > > > > suggestion.
> > > > > > > > > > > - It's better to leverage ResourceProfile and ResourceSpec after
> > > > > > > > > > > we support fine-grained GPU scheduling. As a first step
> > > > > > > > > > > proposal, I prefer not to include it in the scope of this FLIP.
> > > > > > > > > > > - Regarding the "Extended Resource Manager", if I understand
> > > > > > > > > > > correctly, it is just a code refactoring atm; we could extract
> > > > > > > > > > > the open/close/allocateExtendResources of GPUManager to that
> > > > > > > > > > > interface. If that is the case, +1 to do it during
> > > > > > > > > > > implementation.
> > > > > > > > > > >
> > > > > > > > > > > @Xingbo
> > > > > > > > > > > As Xintong said, we looked into how Spark supports a general
> > > > > > > > > > > "Custom Resource Scheduling" before and decided to introduce a
> > > > > > > > > > > common resource configuration schema
> > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > to make it more extensible. I think the "resource" is a proper
> > > > > > > > > > > level to contain all the configs of extended resources.
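
To make the schema above concrete, here is a small hypothetical Java sketch that groups `taskmanager.resource.{resourceName}.{option}` keys by resource name. `ExternalResourceConfigParser` is an assumed name, the plain `Map<String, String>` stands in for Flink's configuration object, and the script path in the example is a placeholder.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups per-resource options by resource name.
final class ExternalResourceConfigParser {
    static final String PREFIX = "taskmanager.resource.";

    static Map<String, Map<String, String>> parse(Map<String, String> config) {
        Map<String, Map<String, String>> byResource = new TreeMap<>();
        for (Map.Entry<String, String> entry : config.entrySet()) {
            String key = entry.getKey();
            if (!key.startsWith(PREFIX)) {
                continue; // not an extended-resource option
            }
            String rest = key.substring(PREFIX.length());
            int dot = rest.indexOf('.');
            if (dot <= 0 || dot == rest.length() - 1) {
                continue; // malformed: missing resource name or option
            }
            String resourceName = rest.substring(0, dot);
            String option = rest.substring(dot + 1);
            byResource.computeIfAbsent(resourceName, k -> new TreeMap<>())
                    .put(option, entry.getValue());
        }
        return byResource;
    }
}

class ExternalResourceConfigSketch {
    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("taskmanager.resource.gpu.amount", "2");
        // Placeholder path, for illustration only.
        config.put("taskmanager.resource.gpu.discovery-script.path", "/path/to/discover.sh");
        config.put("taskmanager.numberOfTaskSlots", "4");
        System.out.println(ExternalResourceConfigParser.parse(config));
    }
}
```

Under this layout, a new resource type only needs a new `{resourceName}` segment, which is what makes the schema extensible in the sense described above.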
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <
> > > > > [hidden email]
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > >
> > > > > > > > > > > > There is no doubt that GPU resource management
> support
> > > will
> > > > > > > greatly
> > > > > > > > > > > > facilitate the development of AI-related applications
> > by
> > > > > > PyFlink
> > > > > > > > > users.
> > > > > > > > > > > >
> > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > >
> > > > > > > > > > > > Regarding the names of several GPU configurations, I think it
> > > > > > > > > > > > is better to delete the "resource" field, which makes them
> > > > > > > > > > > > consistent with the names of other resource-related
> > > > > > > > > > > > configurations in TaskManagerOptions.
> > > > > > > > > > > >
> > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > >
> > > > > > > > > > > > Xingbo
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline
> > > > discussion
> > > > > > > about
> > > > > > > > > > > making
> > > > > > > > > > > > > the "GPU Support" as some general "Extended
> Resource
> > > > > > Support".
> > > > > > > We
> > > > > > > > > > > believe
> > > > > > > > > > > > > supporting extended resources in a general
> mechanism
> > is
> > > > > > > definitely
> > > > > > > > > a
> > > > > > > > > > > good
> > > > > > > > > > > > > and extensible way. The reason we propose this FLIP
> > > > > narrowing
> > > > > > > its
> > > > > > > > > scope
> > > > > > > > > > > > > down to GPU alone, is mainly for the concern on
> extra
> > > > > efforts
> > > > > > > and
> > > > > > > > > > > review
> > > > > > > > > > > > > capacity needed for a general mechanism.
> > > > > > > > > > > > >
> > > > > > > > > > > > > To come up with a good design for a general extended resource
> > > > > > > > > > > > > management mechanism, we would need to investigate more on how
> > > > > > > > > > > > > people use different kinds of resources in practice. For GPU, we
> > > > > > > > > > > > > learnt such knowledge from the experts, Becket and his team
> > > > > > > > > > > > > members. But for FPGA, or other potential extended resources, we
> > > > > > > > > > > > > don't have such convenient information sources, so the
> > > > > > > > > > > > > investigation would require more effort, which I tend to think
> > > > > > > > > > > > > is not necessary atm.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On the other hand, we also looked into how Spark
> > > > supports a
> > > > > > > general
> > > > > > > > > > > "Custom
> > > > > > > > > > > > > Resource Scheduling". Assuming we want to have a
> > > similar
> > > > > > > general
> > > > > > > > > > > extended
> > > > > > > > > > > > > resource mechanism in the future, we believe that
> the
> > > > > current
> > > > > > > GPU
> > > > > > > > > > > support
> > > > > > > > > > > > > design can be easily extended, in an incremental
> way
> > > > > without
> > > > > > > too
> > > > > > > > > many
> > > > > > > > > > > > > reworks.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - The most important part is probably user interfaces. Spark
> > > > > > > > > > > > > offers configuration options to define the amount, discovery
> > > > > > > > > > > > > script and vendor (on k8s) on a per resource type basis [1],
> > > > > > > > > > > > > which is very similar to what we proposed in this FLIP. I think
> > > > > > > > > > > > > it's not necessary to expose config options in the general way
> > > > > > > > > > > > > atm, since we do not have support for other resource types now.
> > > > > > > > > > > > > If later we decide to have per resource type config options, we
> > > > > > > > > > > > > can keep backwards compatibility with the currently proposed
> > > > > > > > > > > > > options through a simple key mapping.
> > > > > > > > > > > > > - For the GPU Manager, if later needed we can
> change
> > it
> > > > to
> > > > > a
> > > > > > > > > > > "Extended
> > > > > > > > > > > > > Resource Manager" (or whatever it is called). That
> > > should
> > > > > be
> > > > > > a
> > > > > > > > > pure
> > > > > > > > > > > > > component-internal refactoring.
> > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are
> > > already
> > > > > > > > > fields for
> > > > > > > > > > > > > general extended resource. We can of course
> leverage
> > > them
> > > > > > when
> > > > > > > > > > > > > supporting
> > > > > > > > > > > > > fine grained GPU scheduling. That is also not in
> the
> > > > scope
> > > > > of
> > > > > > > > > this
> > > > > > > > > > > first
> > > > > > > > > > > > > step proposal, and would require FLIP-56 to be
> > finished
> > > > > > first.
> > > > > > > > > > > > >
> > > > > > > > > > > > > To sum up, I agree with Becket that we should have a separate
> > > > > > > > > > > > > FLIP for the general extended resource mechanism, and keep it in
> > > > > > > > > > > > > mind when discussing and implementing the current one.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > >
> > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1]
> > > > > > > > > > > > > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <
> > > > > > > [hidden email]>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > That's a good point, Stephan. It makes total
> sense
> > to
> > > > > > > generalize
> > > > > > > > > the
> > > > > > > > > > > > > > resource management to support custom resources.
> > > Having
> > > > > > that
> > > > > > > > > allows
> > > > > > > > > > > users
> > > > > > > > > > > > > > to add new resources by themselves. The general
> > > > resource
> > > > > > > > > management
> > > > > > > > > > > may
> > > > > > > > > > > > > > involve two different aspects:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. The custom resource type definition. It is
> > > supported
> > > > > by
> > > > > > > the
> > > > > > > > > > > extended
> > > > > > > > > > > > > > resources in ResourceProfile and ResourceSpec.
> This
> > > > will
> > > > > > > likely
> > > > > > > > > cover
> > > > > > > > > > > > > > majority of the cases.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how
> > to
> > > > > assign
> > > > > > > the
> > > > > > > > > > > resources
> > > > > > > > > > > > > > to different tasks, operators, and so on. This
> may
> > > > > require
> > > > > > > two
> > > > > > > > > > > levels /
> > > > > > > > > > > > > > steps:
> > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put
> > > into
> > > > > > > > > suitable
> > > > > > > > > > > > > slots.
> > > > > > > > > > > > > > It is done by the global RM and is not
> customizable
> > > > right
> > > > > > > now.
> > > > > > > > > > > > > > b. Operator level - map the exact resource to the
> > > > > operators
> > > > > > > > > in
> > > > > > > > > > > TM.
> > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > GPU 1 for operator A, GPU 2 for operator B. This
> > step
> > > > is
> > > > > > > needed
> > > > > > > > > > > assuming
> > > > > > > > > > > > > > the global RM does not distinguish individual
> > > resources
> > > > > of
> > > > > > > the
> > > > > > > > > same
> > > > > > > > > > > type.
> > > > > > > > > > > > > > It is true for memory, but not for GPU.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should
> > > > > > > > > > > > > > discover the physical GPU information and bind/match them to
> > > > > > > > > > > > > > each operator. Making this general will fill in the missing
> > > > > > > > > > > > > > piece to support custom resource type definition. But I'd avoid
> > > > > > > > > > > > > > calling it an "External Resource Manager" to avoid confusion
> > > > > > > > > > > > > > with RM; maybe something like "Operator Resource Assigner"
> > > > > > > > > > > > > > would be more accurate. So for each resource type users can
> > > > > > > > > > > > > > have an optional "Operator Resource Assigner" in the TM. For
> > > > > > > > > > > > > > memory, users don't need this, but for other extended
> > > > > > > > > > > > > > resources, users may need that.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource
> > > > > Assigner"
> > > > > > > is
> > > > > > > > > > > achievable
> > > > > > > > > > > > > > in this FLIP. But I am also OK with having that
> in
> > a
> > > > > > separate
> > > > > > > > > FLIP
> > > > > > > > > > > > > because
> > > > > > > > > > > > > > the interface between the "Operator Resource
> > > Assigner"
> > > > > and
> > > > > > > > > operator
> > > > > > > > > > > may
> > > > > > > > > > > > > > take a while to settle down if we want to make it
> > > > > generic.
> > > > > > > But I
> > > > > > > > > > > think
> > > > > > > > > > > > > our
> > > > > > > > > > > > > > implementation should take this future work into
> > > > > > > consideration so
> > > > > > > > > > > that we
> > > > > > > > > > > > > > don't need to break backwards compatibility once
> we
> > > > have
> > > > > > > that.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <
> > > > > > > [hidden email]>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I cannot really give much input into the
> > mechanics
> > > of
> > > > > > > GPU-aware
> > > > > > > > > > > > > > scheduling
> > > > > > > > > > > > > > > and GPU allocation, as I have no experience
> with
> > > > that.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > One thought I had when reading the proposal is
> if
> > > it
> > > > > > makes
> > > > > > > > > sense to
> > > > > > > > > > > > > look
> > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > the "GPU Manager" as an "External Resource
> > > Manager",
> > > > > and
> > > > > > > GPU
> > > > > > > > > is one
> > > > > > > > > > > > > such
> > > > > > > > > > > > > > > resource.
> > > > > > > > > > > > > > > The way I understand the ResourceProfile and
> > > > > > ResourceSpec,
> > > > > > > > > that is
> > > > > > > > > > > how
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > > is done there.
> > > > > > > > > > > > > > > It has the advantage that it looks more extensible. Maybe there
> > > > > > > > > > > > > > > is a GPU Resource, a specialized NVIDIA GPU Resource, an FPGA
> > > > > > > > > > > > > > > Resource, an Alibaba TPU Resource, etc.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <
> > > > > > > > > [hidden email]>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for the FLIP Yangze. GPU resource management support is a
> > > > > > > > > > > > > > > > must-have for machine learning use cases. Actually it is one of
> > > > > > > > > > > > > > > > the most frequently asked questions from users who are
> > > > > > > > > > > > > > > > interested in using Flink for ML.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Some quick comments / questions to the wiki.
> > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also
> be
> > > > > > > mentioned in
> > > > > > > > > the
> > > > > > > > > > > > > public
> > > > > > > > > > > > > > > > interface section.
> > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info
> > > also a
> > > > > > > public
> > > > > > > > > API?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song
> <
> > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking
> off
> > > the
> > > > > > > > > discussion,
> > > > > > > > > > > > > Yangze.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting using
> of
> > > GPU
> > > > in
> > > > > > > Flink
> > > > > > > > > is
> > > > > > > > > > > > > > > significant,
> > > > > > > > > > > > > > > > > especially for the ML scenarios.
> > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it
> looks
> > > good
> > > > > to
> > > > > > > me. I
> > > > > > > > > > > think
> > > > > > > > > > > > > > it's a
> > > > > > > > > > > > > > > > > very good first step for Flink's GPU
> > supports.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo
> <
> > > > > > > > > [hidden email]
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > We would like to start a discussion
> thread
> > on
> > > > > > > "FLIP-108:
> > > > > > > > > Add
> > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following
> > > > issues:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > - Enable user to configure how many GPUs
> > in a
> > > > > task
> > > > > > > > > executor
> > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > forward such requirements to the external
> > > > > resource
> > > > > > > > > managers
> > > > > > > > > > > (for
> > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > - Provide information of available GPU
> > > > resources
> > > > > to
> > > > > > > > > > > operators.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as
> > > > follows:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to
> > > > > > > Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task
> > > > manager
> > > > > > > > > services to
> > > > > > > > > > > > > > > discover
> > > > > > > > > > > > > > > > > > and expose GPU resource information to
> the
> > > > > context
> > > > > > of
> > > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > - Introduce the default script for GPU
> > > > discovery,
> > > > > > in
> > > > > > > > > which we
> > > > > > > > > > > > > > provide
> > > > > > > > > > > > > > > > > > the privilege mode to help user to
> achieve
> > > > > > > worker-level
> > > > > > > > > > > isolation
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki
> > > > > document
> > > > > > > [1].
> > > > > > > > > > > Looking
> > > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > your feedbacks.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Yangze Guo

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Thanks for the suggestions, @Stephan, @Becket and @Xintong.

I've updated the FLIP accordingly. I did not add a
ResourceInfoProvider. Instead, I introduced the ExternalResourceDriver,
which is responsible for all relevant operations on both the RM
and TM sides.
After rethinking how to decouple the management of external resources
from TaskExecutor, I think we could do the same thing on the
ResourceManager side. We would not need to add specific allocation
logic to the ResourceManager each time we add a new external
resource.
- For Yarn, we need the ExternalResourceDriver to edit the containerRequest.
- For Kubernetes, the ExternalResourceDriver could provide a decorator for
the TM pod.

In this way, just like MetricReporter, we allow users to define their
custom ExternalResourceDriver. It is more extensible and fits the
separation of concerns. For more details, please take a look at [1].
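
To make the idea concrete, here is a minimal sketch of how such a driver
could look. It is only an illustration under my own assumptions: the method
names (resourceName, decorateRequest, discover), the map-based request model
and the "example.com/gpu" key are hypothetical and are not the interface
defined in the FLIP. The point is that the generic RM / TM logic only
iterates over drivers and never mentions GPUs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ExternalResourceDriverSketch {

    /** Hypothetical contract; method names are illustrative, not the FLIP's final API. */
    interface ExternalResourceDriver {
        /** Logical resource name, e.g. "gpu". */
        String resourceName();

        /** RM side: add the resource to a container or pod request (modelled as a map). */
        void decorateRequest(Map<String, Long> request, long amount);

        /** TM side: return identifiers of the concrete resources found on this worker. */
        List<String> discover(long amount);
    }

    /** Toy GPU driver. A real one would run a discovery script instead of counting. */
    static final class GpuDriver implements ExternalResourceDriver {
        private final String requestKey; // deployment-specific key, assumed to come from config

        GpuDriver(String requestKey) {
            this.requestKey = requestKey;
        }

        @Override
        public String resourceName() {
            return "gpu";
        }

        @Override
        public void decorateRequest(Map<String, Long> request, long amount) {
            request.put(requestKey, amount);
        }

        @Override
        public List<String> discover(long amount) {
            List<String> indexes = new ArrayList<>();
            for (long i = 0; i < amount; i++) {
                indexes.add(Long.toString(i));
            }
            return indexes;
        }
    }

    /** Generic RM logic: knows only the driver contract, nothing GPU specific. */
    static Map<String, Long> buildRequest(List<ExternalResourceDriver> drivers,
                                          Map<String, Long> amounts) {
        Map<String, Long> request = new LinkedHashMap<>();
        for (ExternalResourceDriver driver : drivers) {
            Long amount = amounts.get(driver.resourceName());
            if (amount != null && amount > 0) {
                driver.decorateRequest(request, amount);
            }
        }
        return request;
    }

    /** Generic TM logic: collect the discovered resource info per resource name. */
    static Map<String, List<String>> discoverAll(List<ExternalResourceDriver> drivers,
                                                 Map<String, Long> amounts) {
        Map<String, List<String>> info = new LinkedHashMap<>();
        for (ExternalResourceDriver driver : drivers) {
            Long amount = amounts.get(driver.resourceName());
            if (amount != null && amount > 0) {
                info.put(driver.resourceName(), driver.discover(amount));
            }
        }
        return info;
    }

    public static void main(String[] args) {
        List<ExternalResourceDriver> drivers = List.of(new GpuDriver("example.com/gpu"));
        Map<String, Long> amounts = Map.of("gpu", 2L);
        System.out.println(buildRequest(drivers, amounts));
        System.out.println(discoverAll(drivers, amounts));
    }
}
```

Supporting another resource type (e.g. FPGA) would then mean adding one more
driver implementation to the list, without touching buildRequest or
discoverAll.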

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink

Best,
Yangze Guo

On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:

>
> This sounds good to go ahead from my side.
>
> I like the approach that Becket suggested - in that case the core
> abstraction that everyone would need to understand would be "external
> resource allocation" and the "ResourceInfoProvider", and the GPU specific
> code would be a specific implementation only known to that component that
> allocates the external resource. That fits the separation of concerns well.
>
> I also understand that it should not be over-engineered in the first
> version, so some simplification makes sense, and then gradually expand from
> there.
>
> So +1 to go ahead with what was suggested above (Xintong / Becket) from my
> side.
>
> On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
>
> > Thanks for the comments, Stephan & Becket.
> >
> > @Stephan
> >
> > I see your concern, and I completely agree with you that we should first
> > think about the "library" / "plugin" / "extension" style if possible.
> >
> > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > although it looks that it would belong to the slot then. Is that what we
> > > are doing here?
> >
> >
> > In the current proposal, we do not have the GPUs sliced and assigned to
> > slots, because it could be problematic without dynamic slot allocation.
> > E.g., the number of GPUs might not be evenly divisible by the number of
> > slots.
> >
> > I think it makes sense to eventually have the GPUs assigned to slots. Even
> > then, we might still need a TM level GPUManager (or ResourceProvider like
> > Becket suggested). For memory, in each slot we can simply request the
> > amount of memory, leaving it to JVM / OS to decide which memory (address)
> > should be assigned. For GPU, and potentially other resources like FPGA, we
> > need to explicitly specify which GPU (index) should be used. Therefore, we
> > need some component at the TM level to coordinate which slot uses which
> > GPU.
> >
> > IMO, unless we say Flink will not support slot-level GPU slicing at least
> > in the foreseeable future, I don't see a good way to avoid touching the TM
> > core. To that end, I think Becket's suggestion points to a good direction,
> > that supports more features (GPU, FPGA, etc.) with less coupling to the TM
> > core (only needs to understand the general interfaces). The detailed
> > implementation for specific resource types can even be encapsulated as a
> > library.
> >
> > @Becket
> >
> > Thanks for sharing your thought on the final state. Despite the details how
> > the interfaces should look like, I think this is a really good abstraction
> > for supporting general resource types.
> >
> > I'd like to further clarify that, the following three things are all that
> > the "Flink core" needs to understand.
> >
> > - The *amount* of resource, for scheduling. Actually, we already have
> > the Resource class in ResourceProfile and ResourceSpec for extended
> > resource. It's just not really used.
> > - The *info*, that Flink provides to the operators / user codes.
> > - The *provider*, which generates the info based on the amount.
> >
> > The "core" does not need to understand the specific implementation details
> > of the above three. They can even be implemented in a 3rd-party library.
> > Similar to how we allow users to define their custom MetricReporter.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
> >
> > > Thanks for the comment, Stephan.
> > >
> > > - If everything becomes a "core feature", it will make the project hard
> > > > to develop in the future. Thinking "library" / "plugin" / "extension"
> > > style
> > > > where possible helps.
> > >
> > >
> > > Completely agree. It is much more important to design a mechanism than
> > > focusing on a specific case. Here is what I am thinking to fully support
> > > custom resource management:
> > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to define
> > the
> > > resource and the amount required. They will be used to find suitable TMs
> > > slots to run the tasks. At this point, the resources are only measured by
> > > amount, i.e. they do not have individual ID.
> > >
> > > > 2. On the TM side, have something like *"ResourceInfoProvider"* to identify
> > > > and provide the detailed information of the individual resource, e.g. GPU
> > > > ID. It is important because the operator may have to explicitly interact
> > > > with the physical resource it uses. The ResourceInfoProvider might look
> > > > like something below.
> > > interface ResourceInfoProvider<INFO> {
> > > Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId,
> > > ResourceProfile resourceProfile);
> > > }
> > >
> > > - There could be several "*ResourceInfoProvider*" configured on the TM to
> > > retrieve the information for different resources.
> > > - The TM will be responsible to assign those individual resources to each
> > > operator according to their requested amount.
> > > - The operators will be able to get the ResourceInfo from their
> > > RuntimeContext.
> > >
> > > > If we agree this is a reasonable final state, we can adapt the current FLIP
> > > > to it. In fact it does not sound like a big change to me. All the proposed
> > > > configuration can stay as is; it is just that Flink itself won't care about
> > > > them. Instead, a GPUInfoProvider implementing the ResourceInfoProvider will
> > > > use them.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]> wrote:
> > >
> > > > Hi all!
> > > >
> > > > The main point I wanted to throw into the discussion is the following:
> > > > - With more and more use cases, more and more tools go into Flink
> > > > - If everything becomes a "core feature", it will make the project
> > hard
> > > > to develop in the future. Thinking "library" / "plugin" / "extension"
> > > style
> > > > where possible helps.
> > > >
> > > > - A good thought experiment is always: How many future developers
> > have
> > > to
> > > > interact with this code (and possibly understand it partially), even if
> > > the
> > > > features they touch have nothing to do with GPU support. If many
> > > > contributors to unrelated features will have to touch it and understand
> > > it,
> > > > then let's think if there is a different solution. Maybe there is not,
> > > but
> > > > then we should be sure why.
> > > >
> > > > - That led me to raising this issue: If the GPU manager becomes a
> > core
> > > > service in the TaskManager, Environment, RuntimeContext, etc. then
> > > everyone
> > > > developing TM and streaming tasks need to understand the GPU manager.
> > > That
> > > > seems oddly specific, is my impression.
> > > >
> > > > Access to configuration seems not the right reason to do that. We
> > should
> > > > expose the Flink configuration from the RuntimeContext anyways.
> > > >
> > > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > > although it looks that it would belong to the slot then. Is that what
> > we
> > > > are doing here?
> > > >
> > > > Best,
> > > > Stephan
> > > >
> > > >
> > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <[hidden email]>
> > > > wrote:
> > > >
> > > > > Thanks for the feedback, Becket.
> > > > >
> > > > > IMO, eventually an operator should only see info of GPUs that are
> > > > dedicated
> > > > > for it, instead of all GPUs on the machine/container in the current
> > > > design.
> > > > > It does not make sense to let the user who writes a UDF to worry
> > about
> > > > > coordination among multiple operators running on the same machine.
> > And
> > > if
> > > > > we want to limit the GPU info an operator sees, we should not let the
> > > > > operator to instantiate GPUManager, which means we have to expose
> > > > something
> > > > > through runtime context, either GPU info or some kind of limited
> > access
> > > > to
> > > > > the GPUManager.
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <[hidden email]>
> > > wrote:
> > > > >
> > > > > > It probably make sense for us to first agree on the final state.
> > More
> > > > > > specifically, will the resource info be exposed through runtime
> > > context
> > > > > > eventually?
> > > > > >
> > > > > > If that is the final state and we have a seamless migration story
> > > from
> > > > > this
> > > > > > FLIP to that final state, Personally I think it is OK to expose the
> > > GPU
> > > > > > info in the runtime context.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > [hidden email]
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > @Yangze,
> > > > > > > I think what Stephan means (@Stephan, please correct me if I'm
> > > wrong)
> > > > > is
> > > > > > > that, we might not need to hold and maintain the GPUManager as a
> > > > > service
> > > > > > in
> > > > > > > TaskManagerServices or RuntimeContext. An alternative is to
> > create
> > > /
> > > > > > > retrieve the GPUManager only in the operators that need it, e.g.,
> > > > with
> > > > > a
> > > > > > > static method `GPUManager.get()`.
> > > > > > >
> > > > > > > @Stephan,
> > > > > > > I agree with you on excluding GPUManager from
> > TaskManagerServices.
> > > > > > >
> > > > > > > - For the first step, where we provide unified TM-level GPU
> > > > > > information
> > > > > > > to all operators, it should be fine to have operators access /
> > > > > > > lazy-initiate GPUManager by themselves.
> > > > > > > - In future, we might have some more fine-grained GPU
> > > management,
> > > > > > where
> > > > > > > we need to maintain GPUManager as a service and put GPU info
> > in
> > > > slot
> > > > > > > profiles. But at least for now it's not necessary to introduce
> > > > such
> > > > > > > complexity.
> > > > > > >
> > > > > > > However, I have some concerns on excluding GPUManager from
> > > > > RuntimeContext
> > > > > > > and let operators access it directly.
> > > > > > >
> > > > > > > - Configurations needed for creating the GPUManager is not
> > > always
> > > > > > > available for operators.
> > > > > > > - If later we want to have fine-grained control over GPU
> > (e.g.,
> > > > > > > operators in each slot can only see GPUs reserved for that
> > > slot),
> > > > > the
> > > > > > > approach cannot be easily extended.
> > > > > > >
> > > > > > > I would suggest to wrap the GPUManager behind RuntimeContext and
> > > only
> > > > > > > expose the GPUInfo to users. For now, we can declare a method
> > > > > > > `getGPUInfo()` in RuntimeContext, with a default definition that
> > > > calls
> > > > > > > `GPUManager.get()` to get the lazily-created GPUManager. If later
> > > we
> > > > > want
> > > > > > > to create / retrieve GPUManager in a different way, we can simply
> > > > > change
> > > > > > > how `getGPUInfo` is implemented, without needing to change any
> > > public
> > > > > > > interfaces.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <[hidden email]>
> > > > > wrote:
> > > > > > >
> > > > > > > > > @Stephan
> > > > > > > > Do you mean Minicluster? Yes, it makes sense to share the GPU
> > > > Manager
> > > > > > > > in such scenario.
> > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > GPUManager(ExternalResourceManagers) in TaskExecutor instead of
> > > > > > > > TaskManagerServices.
> > > > > > > >
> > > > > > > > Regarding the RuntimeContext/FunctionContext, it just holds the
> > > GPU
> > > > > > > > info instead of the GPU Manager. AFAIK, it's the only place we
> > > > could
> > > > > > > > pass GPU info to the RichFunction/UserDefinedFunction.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <
> > > > [hidden email]
> > > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000 [hidden email]
> > wrote
> > > > > ----
> > > > > > > > >
> > > > > > > > > > > Can we somehow keep this out of the TaskManager services
> > > > > > > > > > I fear that we could not. IMO, the GPUManager(or
> > > > > > > > > > ExternalServicesManagers in future) is conceptually one of
> > > the
> > > > > task
> > > > > > > > > > manager services, just like MemoryManager before 1.10.
> > > > > > > > > > - It maintains/holds the GPU resource at TM level and all
> > of
> > > > the
> > > > > > > > > > operators allocate the GPU resources from it. So, it should
> > > be
> > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > - We could add a collection called ExternalResourceManagers
> > > to
> > > > > hold
> > > > > > > > > > all managers of other external resources in the future.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Can you help me understand why this needs the addition in
> > > > > > > > > TaskManagerServices
> > > > > > > > > or in the RuntimeContext?
> > > > > > > > > Are you worried about the case when multiple Task Executors
> > run
> > > > in
> > > > > > the
> > > > > > > > same
> > > > > > > > > JVM? That's not common, but wouldn't it actually be good in
> > > that
> > > > > case
> > > > > > > to
> > > > > > > > > share the GPU Manager, given that the GPU is shared?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Stephan
> > > > > > > > >
> > > > > > > > > ---------------------------
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > What parts need information about this?
> > > > > > > > > > In this FLIP, operators need the information. Thus, we
> > expose
> > > > GPU
> > > > > > > > > > information to the RuntimeContext/FunctionContext. The slot
> > > > > profile
> > > > > > > is
> > > > > > > > > > not aware of GPU resources as GPU is TM level resource now.
> > > > > > > > > >
> > > > > > > > > > > Can the GPU Manager be a "self contained" thing that
> > simply
> > > > > takes
> > > > > > > the
> > > > > > > > > > configuration, and then abstracts everything internally?
> > > > > > > > > > Yes, we just pass the path/args of the discover script and
> > > how
> > > > > many
> > > > > > > > > > GPUs per TM to it. It takes the responsibility to get the
> > GPU
> > > > > > > > > > information and expose them to the
> > > > RuntimeContext/FunctionContext
> > > > > > of
> > > > > > > > > > Operators. Meanwhile, we'd better not allow operators to
> > > > directly
> > > > > > > > > > access GPUManager, it should get what they want from
> > Context.
> > > > We
> > > > > > > could
> > > > > > > > > > then decouple the interface/implementation of GPUManager
> > and
> > > > > Public
> > > > > > > > > > API.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <
> > > [hidden email]
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > It sounds fine to initially start with GPU specific
> > support
> > > > and
> > > > > > > think
> > > > > > > > > > about
> > > > > > > > > > > generalizing this once we better understand the space.
> > > > > > > > > > >
> > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > - Can we somehow keep this out of the TaskManager
> > services?
> > > > > > > Anything
> > > > > > > > we
> > > > > > > > > > > have to pull through all layers of the TM makes the TM
> > > > > components
> > > > > > > yet
> > > > > > > > > > more
> > > > > > > > > > > complex and harder to maintain.
> > > > > > > > > > >
> > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > -> do the slot profiles need information about the GPU?
> > > > > > > > > > > -> Can the GPU Manager be a "self contained" thing that
> > > > simply
> > > > > > > takes
> > > > > > > > > > > the configuration, and then abstracts everything
> > > internally?
> > > > > > > > Operators
> > > > > > > > > > can
> > > > > > > > > > > access it via "GPUManager.get()" or so?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <
> > > > [hidden email]>
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for all the feedbacks.
> > > > > > > > > > > >
> > > > > > > > > > > > @Becket
> > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right, I'll add
> > > > them
> > > > > to
> > > > > > > the
> > > > > > > > > > > > Public API section.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > Regarding the general extended resource mechanism, I
> > > second
> > > > > > > > Xintong's
> > > > > > > > > > > > suggestion.
> > > > > > > > > > > > - It's better to leverage ResourceProfile and
> > > ResourceSpec
> > > > > > after
> > > > > > > we
> > > > > > > > > > > > supporting fine-grained GPU scheduling. As a first step
> > > > > > > proposal, I
> > > > > > > > > > > > prefer to not include it in the scope of this FLIP.
> > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I
> > > > understand
> > > > > > > > > > > > correctly, it just a code refactoring atm, we could
> > > extract
> > > > > the
> > > > > > > > > > > > open/close/allocateExtendResources of GPUManager to
> > that
> > > > > > > > interface. If
> > > > > > > > > > > > that is the case, +1 to do it during implementation.
> > > > > > > > > > > >
> > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > As Xintong said, we looked into how Spark supports a
> > > > general
> > > > > > > > "Custom
> > > > > > > > > > > > Resource Scheduling" before and decided to introduce a
> > > > common
> > > > > > > > resource
> > > > > > > > > > > > > configuration schema
> > > > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > to make it more extensible. I think the "resource" is a
> > > > > proper
> > > > > > > > level
> > > > > > > > > > > > to contain all the configs of extended resources.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <
> > > > > > [hidden email]
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > >
> > > > > > > > > > > > > There is no doubt that GPU resource management
> > support
> > > > will
> > > > > > > > greatly
> > > > > > > > > > > > > facilitate the development of AI-related applications
> > > by
> > > > > > > PyFlink
> > > > > > > > > > users.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regarding the names of several GPU configurations, I
> > > > think
> > > > > it
> > > > > > > is
> > > > > > > > > > better
> > > > > > > > > > > > to
> > > > > > > > > > > > > delete the resource field makes it consistent with
> > the
> > > > > names
> > > > > > of
> > > > > > > > other
> > > > > > > > > > > > > resource-related configurations in TaskManagerOption.
> > > > > > > > > > > > >
> > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path
> > ->
> > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline
> > > > > discussion
> > > > > > > > about
> > > > > > > > > > > > making
> > > > > > > > > > > > > > the "GPU Support" as some general "Extended
> > Resource
> > > > > > > Support".
> > > > > > > > We
> > > > > > > > > > > > believe
> > > > > > > > > > > > > > supporting extended resources in a general
> > mechanism
> > > is
> > > > > > > > definitely
> > > > > > > > > > a
> > > > > > > > > > > > good
> > > > > > > > > > > > > > and extensible way. The reason we propose this FLIP
> > > > > > narrowing
> > > > > > > > its
> > > > > > > > > > scope
> > > > > > > > > > > > > > down to GPU alone, is mainly for the concern on
> > extra
> > > > > > efforts
> > > > > > > > and
> > > > > > > > > > > > review
> > > > > > > > > > > > > > capacity needed for a general mechanism.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > To come up with a well design on a general extended
> > > > > > resource
> > > > > > > > > > management
> > > > > > > > > > > > > > mechanism, we would need to investigate more on how
> > > > > people
> > > > > > > use
> > > > > > > > > > > > different
> > > > > > > > > > > > > > kind of resources in practice. For GPU, we learnt
> > > such
> > > > > > > > knowledge
> > > > > > > > > > from
> > > > > > > > > > > > the
> > > > > > > > > > > > > > experts, Becket and his team members. But for FPGA,
> > > or
> > > > > > other
> > > > > > > > > > potential
> > > > > > > > > > > > > > extended resources, we don't have such convenient
> > > > > > information
> > > > > > > > > > sources,
> > > > > > > > > > > > > > making the investigation requires more efforts,
> > > which I
> > > > > > tend
> > > > > > > to
> > > > > > > > > > think
> > > > > > > > > > > > is
> > > > > > > > > > > > > > not necessary atm.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On the other hand, we also looked into how Spark
> > > > > supports a
> > > > > > > > general
> > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > Resource Scheduling". Assuming we want to have a
> > > > similar
> > > > > > > > general
> > > > > > > > > > > > extended
> > > > > > > > > > > > > > resource mechanism in the future, we believe that
> > the
> > > > > > current
> > > > > > > > GPU
> > > > > > > > > > > > support
> > > > > > > > > > > > > > design can be easily extended, in an incremental
> > way
> > > > > > without
> > > > > > > > too
> > > > > > > > > > many
> > > > > > > > > > > > > > reworks.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - The most important part is probably user
> > > interfaces.
> > > > > > Spark
> > > > > > > > > > offers
> > > > > > > > > > > > > > configuration options to define the amount,
> > discovery
> > > > > > script
> > > > > > > > and
> > > > > > > > > > > > vendor
> > > > > > > > > > > > > > (on
> > > > > > > > > > > > > > > k8s) on a per-resource-type basis [1], which is very
> > > > > similar
> > > > > > > to
> > > > > > > > > > what
> > > > > > > > > > > > we
> > > > > > > > > > > > > > proposed in this FLIP. I think it's not necessary
> > to
> > > > > expose
> > > > > > > > > > config
> > > > > > > > > > > > > > options
> > > > > > > > > > > > > > in the general way atm, since we do not have
> > supports
> > > > for
> > > > > > > other
> > > > > > > > > > > > resource
> > > > > > > > > > > > > > types now. If later we decided to have per resource
> > > > type
> > > > > > > config
> > > > > > > > > > > > > > options, we
> > > > > > > > > > > > > > can have backwards compatibility on the current
> > > > proposed
> > > > > > > > options
> > > > > > > > > > > > with
> > > > > > > > > > > > > > simple key mapping.
> > > > > > > > > > > > > > - For the GPU Manager, if later needed we can
> > change
> > > it
> > > > > to
> > > > > > a
> > > > > > > > > > > > "Extended
> > > > > > > > > > > > > > Resource Manager" (or whatever it is called). That
> > > > should
> > > > > > be
> > > > > > > a
> > > > > > > > > > pure
> > > > > > > > > > > > > > component-internal refactoring.
> > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are
> > > > already
> > > > > > > > > > fields for
> > > > > > > > > > > > > > general extended resource. We can of course
> > leverage
> > > > them
> > > > > > > when
> > > > > > > > > > > > > > supporting
> > > > > > > > > > > > > > fine grained GPU scheduling. That is also not in
> > the
> > > > > scope
> > > > > > of
> > > > > > > > > > this
> > > > > > > > > > > > first
> > > > > > > > > > > > > > step proposal, and would require FLIP-56 to be
> > > finished
> > > > > > > first.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a
> > > separate
> > > > > > FLIP
> > > > > > > > for
> > > > > > > > > > the
> > > > > > > > > > > > > > general extended resource mechanism, and keep it in
> > > > mind
> > > > > > when
> > > > > > > > > > > > discussing
> > > > > > > > > > > > > > and implementing the current one.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <
> > > > > > > > [hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > That's a good point, Stephan. It makes total
> > sense
> > > to
> > > > > > > > generalize
> > > > > > > > > > the
> > > > > > > > > > > > > > > resource management to support custom resources.
> > > > Having
> > > > > > > that
> > > > > > > > > > allows
> > > > > > > > > > > > users
> > > > > > > > > > > > > > > to add new resources by themselves. The general
> > > > > resource
> > > > > > > > > > management
> > > > > > > > > > > > may
> > > > > > > > > > > > > > > involve two different aspects:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. The custom resource type definition. It is
> > > > supported
> > > > > > by
> > > > > > > > the
> > > > > > > > > > > > extended
> > > > > > > > > > > > > > > resources in ResourceProfile and ResourceSpec.
> > This
> > > > > will
> > > > > > > > likely
> > > > > > > > > > cover
> > > > > > > > > > > > > > > majority of the cases.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how
> > > to
> > > > > > assign
> > > > > > > > the
> > > > > > > > > > > > resources
> > > > > > > > > > > > > > > to different tasks, operators, and so on. This
> > may
> > > > > > require
> > > > > > > > two
> > > > > > > > > > > > levels /
> > > > > > > > > > > > > > > steps:
> > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put
> > > > into
> > > > > > > > > > suitable
> > > > > > > > > > > > > > slots.
> > > > > > > > > > > > > > > It is done by the global RM and is not
> > customizable
> > > > > right
> > > > > > > > now.
> > > > > > > > > > > > > > > b. Operator level - map the exact resource to the
> > > > > > operators
> > > > > > > > > > in
> > > > > > > > > > > > TM.
> > > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > > GPU 1 for operator A, GPU 2 for operator B. This
> > > step
> > > > > is
> > > > > > > > needed
> > > > > > > > > > > > assuming
> > > > > > > > > > > > > > > the global RM does not distinguish individual
> > > > resources
> > > > > > of
> > > > > > > > the
> > > > > > > > > > same
> > > > > > > > > > > > type.
> > > > > > > > > > > > > > > It is true for memory, but not for GPU.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it
> > > > > should
> > > > > > > > > > discover the
> > > > > > > > > > > > > > > physical GPU information and bind/match them to
> > > each
> > > > > > > > operators.
> > > > > > > > > > > > Making
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > general will fill in the missing piece to support
> > > > > custom
> > > > > > > > resource
> > > > > > > > > > > > type
> > > > > > > > > > > > > > > definition. But I'd avoid calling it a "External
> > > > > Resource
> > > > > > > > > > Manager" to
> > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > > confusion with RM, maybe something like "Operator
> > > > > > Resource
> > > > > > > > > > Assigner"
> > > > > > > > > > > > > > would
> > > > > > > > > > > > > > > be more accurate. So for each resource type users
> > > can
> > > > > > have
> > > > > > > an
> > > > > > > > > > > > optional
> > > > > > > > > > > > > > > "Operator Resource Assigner" in the TM. For
> > memory,
> > > > > users
> > > > > > > > don't
> > > > > > > > > > need
> > > > > > > > > > > > > > this,
> > > > > > > > > > > > > > > but for other extended resources, users may need
> > > > that.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource
> > > > > > Assigner"
> > > > > > > > is
> > > > > > > > > > > > achievable
> > > > > > > > > > > > > > > in this FLIP. But I am also OK with having that
> > in
> > > a
> > > > > > > separate
> > > > > > > > > > FLIP
> > > > > > > > > > > > > > because
> > > > > > > > > > > > > > > the interface between the "Operator Resource
> > > > Assigner"
> > > > > > and
> > > > > > > > > > operator
> > > > > > > > > > > > may
> > > > > > > > > > > > > > > take a while to settle down if we want to make it
> > > > > > generic.
> > > > > > > > But I
> > > > > > > > > > > > think
> > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > implementation should take this future work into
> > > > > > > > consideration so
> > > > > > > > > > > > that we
> > > > > > > > > > > > > > > don't need to break backwards compatibility once
> > we
> > > > > have
> > > > > > > > that.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <
> > > > > > > > [hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I cannot really give much input into the
> > > mechanics
> > > > of
> > > > > > > > GPU-aware
> > > > > > > > > > > > > > > scheduling
> > > > > > > > > > > > > > > > and GPU allocation, as I have no experience
> > with
> > > > > that.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > One thought I had when reading the proposal is
> > if
> > > > it
> > > > > > > makes
> > > > > > > > > > sense to
> > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > the "GPU Manager" as an "External Resource
> > > > Manager",
> > > > > > and
> > > > > > > > GPU
> > > > > > > > > > is one
> > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > resource.
> > > > > > > > > > > > > > > > The way I understand the ResourceProfile and
> > > > > > > ResourceSpec,
> > > > > > > > > > that is
> > > > > > > > > > > > how
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > is done there.
> > > > > > > > > > > > > > > > It has the advantage that it looks more
> > > extensible.
> > > > > > Maybe
> > > > > > > > > > there is
> > > > > > > > > > > > a
> > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > Resource, a specialized NVIDIA GPU Resource,
> > and
> > > > FPGA
> > > > > > > > > > Resource, a
> > > > > > > > > > > > > > Alibaba
> > > > > > > > > > > > > > > > TPU Resource, etc.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <
> > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for the FLIP Yangze. GPU resource
> > > > management
> > > > > > > > support
> > > > > > > > > > is a
> > > > > > > > > > > > > > > > must-have
> > > > > > > > > > > > > > > > > for machine learning use cases. Actually it
> > is
> > > > one
> > > > > of
> > > > > > > the
> > > > > > > > > > > most
> > > > > > > > > > > > > > > asked
> > > > > > > > > > > > > > > > > > questions from the users who are interested in
> > > > using
> > > > > > > Flink
> > > > > > > > > > for ML.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Some quick comments / questions to the wiki.
> > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also
> > be
> > > > > > > > mentioned in
> > > > > > > > > > the
> > > > > > > > > > > > > > public
> > > > > > > > > > > > > > > > > interface section.
> > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info
> > > > also a
> > > > > > > > public
> > > > > > > > > > API?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song
> > <
> > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking
> > off
> > > > the
> > > > > > > > > > discussion,
> > > > > > > > > > > > > > Yangze.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting using
> > of
> > > > GPU
> > > > > in
> > > > > > > > Flink
> > > > > > > > > > is
> > > > > > > > > > > > > > > > significant,
> > > > > > > > > > > > > > > > > > especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it
> > looks
> > > > good
> > > > > > to
> > > > > > > > me. I
> > > > > > > > > > > > think
> > > > > > > > > > > > > > > it's a
> > > > > > > > > > > > > > > > > > very good first step for Flink's GPU
> > > support.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo
> > <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > We would like to start a discussion
> > thread
> > > on
> > > > > > > > "FLIP-108:
> > > > > > > > > > Add
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following
> > > > > issues:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > - Enable user to configure how many GPUs
> > > in a
> > > > > > task
> > > > > > > > > > executor
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > forward such requirements to the external
> > > > > > resource
> > > > > > > > > > managers
> > > > > > > > > > > > (for
> > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > - Provide information of available GPU
> > > > > resources
> > > > > > to
> > > > > > > > > > > > operators.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as
> > > > > follows:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to
> > > > > > > > Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task
> > > > > manager
> > > > > > > > > > services to
> > > > > > > > > > > > > > > > discover
> > > > > > > > > > > > > > > > > > > and expose GPU resource information to
> > the
> > > > > > context
> > > > > > > of
> > > > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU
> > > > > discovery,
> > > > > > > in
> > > > > > > > > > which we
> > > > > > > > > > > > > > > provide
> > > > > > > > > > > > > > > > > > > the privilege mode to help user to
> > achieve
> > > > > > > > worker-level
> > > > > > > > > > > > isolation
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki
> > > > > > document
> > > > > > > > [1].
> > > > > > > > > > > > Looking
> > > > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > your feedback.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
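
For readers following the thread, the operator-level step that Becket labels 2.b above (mapping concrete GPUs to operators inside a TM, since the global RM only counts amounts) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not code from FLIP-108 or from Flink: the class name OperatorGpuAssigner, its assign/release methods, and the use of plain strings as operator IDs are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical TM-level assigner (not a Flink class): it tracks which
 * discovered GPU indexes are free and hands concrete indexes to operators.
 */
final class OperatorGpuAssigner {
    // GPU indexes not yet bound to any operator, in discovery order.
    private final Deque<Integer> free = new ArrayDeque<>();
    // Operator ID -> GPU indexes currently bound to that operator.
    private final Map<String, List<Integer>> assigned = new HashMap<>();

    OperatorGpuAssigner(List<Integer> discoveredGpuIndexes) {
        free.addAll(discoveredGpuIndexes);
    }

    /** Binds `amount` free GPUs to the operator, or throws if too few are free. */
    synchronized List<Integer> assign(String operatorId, int amount) {
        if (amount < 0 || amount > free.size()) {
            throw new IllegalStateException(
                    "Cannot assign " + amount + " GPU(s); free: " + free.size());
        }
        List<Integer> picked = new ArrayList<>();
        for (int i = 0; i < amount; i++) {
            picked.add(free.removeFirst());
        }
        assigned.computeIfAbsent(operatorId, k -> new ArrayList<>()).addAll(picked);
        return picked;
    }

    /** Returns all GPUs held by the operator to the free pool. */
    synchronized void release(String operatorId) {
        List<Integer> held = assigned.remove(operatorId);
        if (held != null) {
            free.addAll(held);
        }
    }
}
```

With three discovered GPUs (indexes 0, 1, 2), assigning one GPU to operator A and two to operator B leaves nothing for a third operator until A or B releases its GPUs. Memory needs no such component because the JVM / OS picks the addresses; for GPUs the index has to be chosen explicitly, which is the point made in 2.b.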

Stephan Ewen

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Nice, thanks a lot!

On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:

> Thanks for the suggestion, @Stephan, @Becket and @Xintong.
>
> I've updated the FLIP accordingly. I did not add a
> ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> which takes the responsibility of all relevant operations on both RM
> and TM sides.
> After a rethink about decoupling the management of external resources
> from TaskExecutor, I think we could do the same thing on the
> ResourceManager side. We do not need to add a specific allocation
> logic to the ResourceManager each time we add a specific external
> resource.
> - For Yarn, we need the ExternalResourceDriver to edit the
> containerRequest.
> - For Kubernetes, ExternalResourceDriver could provide a decorator for
> the TM pod.
>
> In this way, just like MetricReporter, we allow users to define their
> custom ExternalResourceDriver. It is more extensible and fits the
> separation of concerns. For more details, please take a look at [1].
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
>
> Best,
> Yangze Guo
>
> On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> >
> > This sounds good to go ahead from my side.
> >
> > I like the approach that Becket suggested - in that case the core
> > abstraction that everyone would need to understand would be "external
> > resource allocation" and the "ResourceInfoProvider", and the GPU specific
> > code would be a specific implementation only known to that component that
> > allocates the external resource. That fits the separation of concerns
> well.
> >
> > I also understand that it should not be over-engineered in the first
> > version, so some simplification makes sense, and then gradually expand
> from
> > there.
> >
> > So +1 to go ahead with what was suggested above (Xintong / Becket) from
> my
> > side.
> >
> > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]>
> wrote:
> >
> > > Thanks for the comments, Stephan & Becket.
> > >
> > > @Stephan
> > >
> > > I see your concern, and I completely agree with you that we should
> first
> > > think about the "library" / "plugin" / "extension" style if possible.
> > >
> > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > > although it looks that it would belong to the slot then. Is that
> what we
> > > > are doing here?
> > >
> > >
> > > In the current proposal, we do not have the GPUs sliced and assigned to
> > > slots, because it could be problematic without dynamic slot allocation.
> > > E.g., the number of GPUs might not be evenly divisible by the number of
> > > slots.
> > >
> > > I think it makes sense to eventually have the GPUs assigned to slots.
> Even
> > > then, we might still need a TM level GPUManager (or ResourceProvider
> like
> > > Becket suggested). For memory, in each slot we can simply request the
> > > amount of memory, leaving it to JVM / OS to decide which memory
> (address)
> > > should be assigned. For GPU, and potentially other resources like
> FPGA, we
> > > need to explicitly specify which GPU (index) should be used.
> Therefore, we
> > > need some component at the TM level to coordinate which slot uses which
> > > GPU.
> > >
> > > IMO, unless we say Flink will not support slot-level GPU slicing at
> least
> > > in the foreseeable future, I don't see a good way to avoid touching
> the TM
> > > core. To that end, I think Becket's suggestion points to a good
> direction,
> > > that supports more features (GPU, FPGA, etc.) with less coupling to
> the TM
> > > core (only needs to understand the general interfaces). The detailed
> > > implementation for specific resource types can even be encapsulated as
> a
> > > library.
> > >
> > > @Becket
> > >
> > > Thanks for sharing your thought on the final state. Despite the
> details how
> > > the interfaces should look like, I think this is a really good
> abstraction
> > > for supporting general resource types.
> > >
> > > I'd like to further clarify that, the following three things are all
> that
> > > the "Flink core" needs to understand.
> > >
> > > - The *amount* of resource, for scheduling. Actually, we already
> have
> > > the Resource class in ResourceProfile and ResourceSpec for extended
> > > resource. It's just not really used.
> > > - The *info*, that Flink provides to the operators / user codes.
> > > - The *provider*, which generates the info based on the amount.
> > >
> > > The "core" does not need to understand the specific implementation
> details
> > > of the above three. They can even be implemented in a 3rd-party
> library.
> > > Similar to how we allow users to define their custom MetricReporter.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]>
> wrote:
> > >
> > > > Thanks for the comment, Stephan.
> > > >
> > > > - If everything becomes a "core feature", it will make the project
> hard
> > > > > to develop in the future. Thinking "library" / "plugin" /
> "extension"
> > > > style
> > > > > where possible helps.
> > > >
> > > >
> > > > Completely agree. It is much more important to design a mechanism
> than
> > > > focusing on a specific case. Here is what I am thinking to fully
> support
> > > > custom resource management:
> > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to
> define
> > > the
> > > > resource and the amount required. They will be used to find suitable
> TMs
> > > > slots to run the tasks. At this point, the resources are only
> measured by
> > > > amount, i.e. they do not have individual ID.
> > > >
> > > > 2. On the TM side, have something like *"ResourceInfoProvider"* to
> > > identify
> > > > > and provide the detailed information of the individual resource, e.g.
> GPU
> > > > > ID. It is important because the operator may have to explicitly
> interact
> > > > with the physical resource it uses. The ResourceInfoProvider might
> look
> > > > like something below.
> > > > > interface ResourceInfoProvider<INFO> {
> > > > >     Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId,
> > > > >             ResourceProfile resourceProfile);
> > > > > }
> > > >
> > > > - There could be several "*ResourceInfoProvider*" configured on the
> TM to
> > > > retrieve the information for different resources.
> > > > - The TM will be responsible to assign those individual resources to
> each
> > > > operator according to their requested amount.
> > > > - The operators will be able to get the ResourceInfo from their
> > > > RuntimeContext.
> > > >
> > > > > If we agree this is a reasonable final state, we can adapt the
> current
> > > FLIP
> > > > > to it. In fact it does not sound like a big change to me. All the proposed
> > > > configuration can be as is, it is just that Flink itself won't care
> about
> > > > > them, instead a GPUInfoProvider implementing the ResourceInfoProvider
> > > will
> > > > use them.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]>
> wrote:
> > > >
> > > > > Hi all!
> > > > >
> > > > > The main point I wanted to throw into the discussion is the
> following:
> > > > > - With more and more use cases, more and more tools go into Flink
> > > > > - If everything becomes a "core feature", it will make the
> project
> > > hard
> > > > > to develop in the future. Thinking "library" / "plugin" /
> "extension"
> > > > style
> > > > > where possible helps.
> > > > >
> > > > > - A good thought experiment is always: How many future developers
> > > have
> > > > to
> > > > > interact with this code (and possibly understand it partially),
> even if
> > > > the
> > > > > features they touch have nothing to do with GPU support. If many
> > > > > contributors to unrelated features will have to touch it and
> understand
> > > > it,
> > > > > then let's think if there is a different solution. Maybe there is
> not,
> > > > but
> > > > > then we should be sure why.
> > > > >
> > > > > - That led me to raising this issue: If the GPU manager becomes a
> > > core
> > > > > service in the TaskManager, Environment, RuntimeContext, etc. then
> > > > everyone
> > > > > > developing TM and streaming tasks needs to understand the GPU
> manager.
> > > > That
> > > > > seems oddly specific, is my impression.
> > > > >
> > > > > Access to configuration seems not the right reason to do that. We
> > > should
> > > > > expose the Flink configuration from the RuntimeContext anyways.
> > > > >
> > > > > If GPUs are sliced and assigned during scheduling, there may be
> reason,
> > > > > although it looks that it would belong to the slot then. Is that
> what
> > > we
> > > > > are doing here?
> > > > >
> > > > > Best,
> > > > > Stephan
> > > > >
> > > > >
> > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <
> [hidden email]>
> > > > > wrote:
> > > > >
> > > > > > Thanks for the feedback, Becket.
> > > > > >
> > > > > > IMO, eventually an operator should only see info of GPUs that are
> > > > > dedicated
> > > > > > for it, instead of all GPUs on the machine/container in the
> current
> > > > > design.
> > > > > > It does not make sense to let the user who writes a UDF to worry
> > > about
> > > > > > coordination among multiple operators running on the same
> machine.
> > > And
> > > > if
> > > > > > we want to limit the GPU info an operator sees, we should not
> let the
> > > > > > operator to instantiate GPUManager, which means we have to expose
> > > > > something
> > > > > > through runtime context, either GPU info or some kind of limited
> > > access
> > > > > to
> > > > > > the GPUManager.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <[hidden email]
> >
> > > > wrote:
> > > > > >
> > > > > > > > It probably makes sense for us to first agree on the final
> state.
> > > More
> > > > > > > specifically, will the resource info be exposed through runtime
> > > > context
> > > > > > > eventually?
> > > > > > >
> > > > > > > If that is the final state and we have a seamless migration
> story
> > > > from
> > > > > > this
> > > > > > > FLIP to that final state, Personally I think it is OK to
> expose the
> > > > GPU
> > > > > > > info in the runtime context.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > > [hidden email]
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > @Yangze,
> > > > > > > > I think what Stephan means (@Stephan, please correct me if
> I'm
> > > > wrong)
> > > > > > is
> > > > > > > > that, we might not need to hold and maintain the GPUManager
> as a
> > > > > > service
> > > > > > > in
> > > > > > > > TaskManagerServices or RuntimeContext. An alternative is to
> > > create
> > > > /
> > > > > > > > retrieve the GPUManager only in the operators that need it,
> e.g.,
> > > > > with
> > > > > > a
> > > > > > > > static method `GPUManager.get()`.
> > > > > > > >
> > > > > > > > @Stephan,
> > > > > > > > I agree with you on excluding GPUManager from
> > > TaskManagerServices.
> > > > > > > >
> > > > > > > > - For the first step, where we provide unified TM-level
> GPU
> > > > > > > information
> > > > > > > > to all operators, it should be fine to have operators
> access /
> > > > > > > > lazy-initiate GPUManager by themselves.
> > > > > > > > - In future, we might have some more fine-grained GPU
> > > > management,
> > > > > > > where
> > > > > > > > we need to maintain GPUManager as a service and put GPU
> info
> > > in
> > > > > slot
> > > > > > > > profiles. But at least for now it's not necessary to
> introduce
> > > > > such
> > > > > > > > complexity.
> > > > > > > >
> > > > > > > > However, I have some concerns on excluding GPUManager from
> > > > > > RuntimeContext
> > > > > > > > and let operators access it directly.
> > > > > > > >
> > > > > > > > > - Configurations needed for creating the GPUManager are not
> > > > always
> > > > > > > > available for operators.
> > > > > > > > - If later we want to have fine-grained control over GPU
> > > (e.g.,
> > > > > > > > operators in each slot can only see GPUs reserved for that
> > > > slot),
> > > > > > the
> > > > > > > > approach cannot be easily extended.
> > > > > > > >
> > > > > > > > I would suggest to wrap the GPUManager behind RuntimeContext
> and
> > > > only
> > > > > > > > expose the GPUInfo to users. For now, we can declare a method
> > > > > > > > `getGPUInfo()` in RuntimeContext, with a default definition
> that
> > > > > calls
> > > > > > > > `GPUManager.get()` to get the lazily-created GPUManager. If
> later
> > > > we
> > > > > > want
> > > > > > > > to create / retrieve GPUManager in a different way, we can
> simply
> > > > > > change
> > > > > > > > how `getGPUInfo` is implemented, without needing to change
> any
> > > > public
> > > > > > > > interfaces.
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > >
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <
> [hidden email]>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > > @Stephan
> > > > > > > > > Do you mean Minicluster? Yes, it makes sense to share the
> GPU
> > > > > Manager
> > > > > > > > > in such scenario.
> > > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > > GPUManager(ExternalResourceManagers) in TaskExecutor
> instead of
> > > > > > > > > TaskManagerServices.
> > > > > > > > >
> > > > > > > > > Regarding the RuntimeContext/FunctionContext, it just
> holds the
> > > > GPU
> > > > > > > > > info instead of the GPU Manager. AFAIK, it's the only
> place we
> > > > > could
> > > > > > > > > pass GPU info to the RichFunction/UserDefinedFunction.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <
> > > > > [hidden email]
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000 [hidden email]
> > > wrote
> > > > > > ----
> > > > > > > > > >
> > > > > > > > > > > > Can we somehow keep this out of the TaskManager
> services
> > > > > > > > > > > I fear that we could not. IMO, the GPUManager(or
> > > > > > > > > > > ExternalServicesManagers in future) is conceptually
> one of
> > > > the
> > > > > > task
> > > > > > > > > > > manager services, just like MemoryManager before 1.10.
> > > > > > > > > > > - It maintains/holds the GPU resource at TM level and
> all
> > > of
> > > > > the
> > > > > > > > > > > operators allocate the GPU resources from it. So, it
> should
> > > > be
> > > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > > - We could add a collection called
> ExternalResourceManagers
> > > > to
> > > > > > hold
> > > > > > > > > > > all managers of other external resources in the future.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Can you help me understand why this needs the addition in
> > > > > > > > > > > TaskManagerServices
> > > > > > > > > > or in the RuntimeContext?
> > > > > > > > > > Are you worried about the case when multiple Task
> Executors
> > > run
> > > > > in
> > > > > > > the
> > > > > > > > > same
> > > > > > > > > > JVM? That's not common, but wouldn't it actually be good
> in
> > > > that
> > > > > > case
> > > > > > > > to
> > > > > > > > > > share the GPU Manager, given that the GPU is shared?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Stephan
> > > > > > > > > >
> > > > > > > > > > ---------------------------
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > In this FLIP, operators need the information. Thus, we
> > > expose
> > > > > GPU
> > > > > > > > > > > information to the RuntimeContext/FunctionContext. The
> slot
> > > > > > profile
> > > > > > > > is
> > > > > > > > > > > not aware of GPU resources as GPU is TM level resource
> now.
> > > > > > > > > > >
> > > > > > > > > > > > Can the GPU Manager be a "self contained" thing that
> > > simply
> > > > > > takes
> > > > > > > > the
> > > > > > > > > > > configuration, and then abstracts everything
> internally?
> > > > > > > > > > > Yes, we just pass the path/args of the discover script
> and
> > > > how
> > > > > > many
> > > > > > > > > > > GPUs per TM to it. It takes the responsibility to get
> the
> > > GPU
> > > > > > > > > > > information and expose them to the
> > > > > RuntimeContext/FunctionContext
> > > > > > > of
> > > > > > > > > > > Operators. Meanwhile, we'd better not allow operators
> to
> > > > > directly
> > > > > > > > > > > > access GPUManager; they should get what they want from
> > > Context.
> > > > > We
> > > > > > > > could
> > > > > > > > > > > then decouple the interface/implementation of
> GPUManager
> > > and
> > > > > > Public
> > > > > > > > > > > API.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <
> > > > [hidden email]
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > It sounds fine to initially start with GPU specific
> > > support
> > > > > and
> > > > > > > > think
> > > > > > > > > > > about
> > > > > > > > > > > > generalizing this once we better understand the
> space.
> > > > > > > > > > > >
> > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > - Can we somehow keep this out of the TaskManager
> > > services?
> > > > > > > > Anything
> > > > > > > > > we
> > > > > > > > > > > > have to pull through all layers of the TM makes the
> TM
> > > > > > components
> > > > > > > > yet
> > > > > > > > > > > more
> > > > > > > > > > > > complex and harder to maintain.
> > > > > > > > > > > >
> > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > -> do the slot profiles need information about the
> GPU?
> > > > > > > > > > > > -> Can the GPU Manager be a "self contained" thing
> that
> > > > > simply
> > > > > > > > takes
> > > > > > > > > > > > the configuration, and then abstracts everything
> > > > internally?
> > > > > > > > > Operators
> > > > > > > > > > > can
> > > > > > > > > > > > access it via "GPUManager.get()" or so?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <
> > > > > [hidden email]>
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right,
> I'll add
> > > > > them
> > > > > > to
> > > > > > > > the
> > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > Regarding the general extended resource mechanism,
> I
> > > > second
> > > > > > > > > Xintong's
> > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > - It's better to leverage ResourceProfile and
> > > > ResourceSpec
> > > > > > > after
> > > > > > > > we
> > > > > > > > > > > > > > support fine-grained GPU scheduling. As a first
> step
> > > > > > > > proposal, I
> > > > > > > > > > > > > prefer to not include it in the scope of this FLIP.
> > > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I
> > > > > understand
> > > > > > > > > > > > > > correctly, it is just a code refactoring atm, we could
> > > > extract
> > > > > > the
> > > > > > > > > > > > > open/close/allocateExtendResources of GPUManager to
> > > that
> > > > > > > > > interface. If
> > > > > > > > > > > > > that is the case, +1 to do it during
> implementation.
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > As Xintong said, we looked into how Spark supports a general
> > > > > > > > > > > > > > "Custom Resource Scheduling" before and decided to introduce a
> > > > > > > > > > > > > > common resource configuration schema
> > > > > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > to make it more extensible. I think "resource" is a proper level
> > > > > > > > > > > > > > to contain all the configs of extended resources.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > There is no doubt that GPU resource management support will
> > > > > > > > > > > > > > > greatly facilitate the development of AI-related applications by
> > > > > > > > > > > > > > > PyFlink users.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I think it is
> > > > > > > > > > > > > > > better to delete the "resource" field, which makes them consistent
> > > > > > > > > > > > > > > with the names of other resource-related configurations in
> > > > > > > > > > > > > > > TaskManagerOptions.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline discussion about
> > > > > > > > > > > > > > > > making the "GPU Support" some general "Extended Resource Support".
> > > > > > > > > > > > > > > > We believe supporting extended resources with a general mechanism
> > > > > > > > > > > > > > > > is definitely a good and extensible way. The reason we propose
> > > > > > > > > > > > > > > > this FLIP with its scope narrowed down to GPU alone is mainly the
> > > > > > > > > > > > > > > > concern about the extra effort and review capacity needed for a
> > > > > > > > > > > > > > > > general mechanism.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > To come up with a good design for a general extended resource
> > > > > > > > > > > > > > > > management mechanism, we would need to investigate more how people
> > > > > > > > > > > > > > > > use different kinds of resources in practice. For GPU, we learnt
> > > > > > > > > > > > > > > > such knowledge from the experts, Becket and his team members. But
> > > > > > > > > > > > > > > > for FPGA, or other potential extended resources, we don't have
> > > > > > > > > > > > > > > > such convenient information sources, so the investigation would
> > > > > > > > > > > > > > > > require more effort, which I tend to think is not necessary atm.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark supports a
> > > > > > > > > > > > > > > > general "Custom Resource Scheduling". Assuming we want to have a
> > > > > > > > > > > > > > > > similar general extended resource mechanism in the future, we
> > > > > > > > > > > > > > > > believe that the current GPU support design can be easily
> > > > > > > > > > > > > > > > extended, in an incremental way without too much rework.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - The most important part is probably the user interfaces. Spark
> > > > > > > > > > > > > > > > offers configuration options to define the amount, discovery
> > > > > > > > > > > > > > > > script and vendor (on k8s) on a per-resource-type basis [1], which
> > > > > > > > > > > > > > > > is very similar to what we proposed in this FLIP. I think it's not
> > > > > > > > > > > > > > > > necessary to expose config options in the general way atm, since
> > > > > > > > > > > > > > > > we do not support other resource types now. If we later decide to
> > > > > > > > > > > > > > > > have per-resource-type config options, we can keep backwards
> > > > > > > > > > > > > > > > compatibility for the currently proposed options with simple key
> > > > > > > > > > > > > > > > mapping.
> > > > > > > > > > > > > > > > - For the GPU Manager, if later needed we can change it to an
> > > > > > > > > > > > > > > > "Extended Resource Manager" (or whatever it is called). That
> > > > > > > > > > > > > > > > should be a pure component-internal refactoring.
> > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are already fields
> > > > > > > > > > > > > > > > for general extended resources. We can of course leverage them
> > > > > > > > > > > > > > > > when supporting fine-grained GPU scheduling. That is also not in
> > > > > > > > > > > > > > > > the scope of this first-step proposal, and would require FLIP-56
> > > > > > > > > > > > > > > > to be finished first.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a separate FLIP
> > > > > > > > > > > > > > > > for the general extended resource mechanism, and keep it in mind
> > > > > > > > > > > > > > > > when discussing and implementing the current one.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to generalize
> > > > > > > > > > > > > > > > > the resource management to support custom resources. Having that
> > > > > > > > > > > > > > > > > allows users to add new resources by themselves. The general
> > > > > > > > > > > > > > > > > resource management may involve two different aspects:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported by the
> > > > > > > > > > > > > > > > > extended resources in ResourceProfile and ResourceSpec. This will
> > > > > > > > > > > > > > > > > likely cover the majority of the cases.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to assign the
> > > > > > > > > > > > > > > > > resources to different tasks, operators, and so on. This may
> > > > > > > > > > > > > > > > > require two levels / steps:
> > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into suitable
> > > > > > > > > > > > > > > > > slots. It is done by the global RM and is not customizable right
> > > > > > > > > > > > > > > > > now.
> > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the operators in
> > > > > > > > > > > > > > > > > TM, e.g. GPU 1 for operator A, GPU 2 for operator B. This step is
> > > > > > > > > > > > > > > > > needed assuming the global RM does not distinguish individual
> > > > > > > > > > > > > > > > > resources of the same type. It is true for memory, but not for
> > > > > > > > > > > > > > > > > GPU.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should discover
> > > > > > > > > > > > > > > > > the physical GPU information and bind/match them to each
> > > > > > > > > > > > > > > > > operator. Making this general will fill in the missing piece to
> > > > > > > > > > > > > > > > > support custom resource type definition. But I'd avoid calling it
> > > > > > > > > > > > > > > > > an "External Resource Manager" to avoid confusion with RM; maybe
> > > > > > > > > > > > > > > > > something like "Operator Resource Assigner" would be more
> > > > > > > > > > > > > > > > > accurate. So for each resource type users can have an optional
> > > > > > > > > > > > > > > > > "Operator Resource Assigner" in the TM. For memory, users don't
> > > > > > > > > > > > > > > > > need this, but for other extended resources, users may need that.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource Assigner" is
> > > > > > > > > > > > > > > > > achievable in this FLIP. But I am also OK with having that in a
> > > > > > > > > > > > > > > > > separate FLIP because the interface between the "Operator
> > > > > > > > > > > > > > > > > Resource Assigner" and operator may take a while to settle down
> > > > > > > > > > > > > > > > > if we want to make it generic. But I think our implementation
> > > > > > > > > > > > > > > > > should take this future work into consideration so that we don't
> > > > > > > > > > > > > > > > > need to break backwards compatibility once we have that.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of GPU-aware
> > > > > > > > > > > > > > > > > > scheduling and GPU allocation, as I have no experience with that.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is if it makes sense
> > > > > > > > > > > > > > > > > > to look at the "GPU Manager" as an "External Resource Manager",
> > > > > > > > > > > > > > > > > > and GPU is one such resource.
> > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and ResourceSpec, that
> > > > > > > > > > > > > > > > > > is how it is done there.
> > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible. Maybe there
> > > > > > > > > > > > > > > > > > is a GPU Resource, a specialized NVIDIA GPU Resource, an FPGA
> > > > > > > > > > > > > > > > > > Resource, an Alibaba TPU Resource, etc.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks for the FLIP Yangze. GPU resource management support is
> > > > > > > > > > > > > > > > > > > a must-have for machine learning use cases. Actually it is one
> > > > > > > > > > > > > > > > > > > of the most asked questions from the users who are interested
> > > > > > > > > > > > > > > > > > > in using Flink for ML.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Some quick comments / questions to the wiki.
> > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be mentioned in
> > > > > > > > > > > > > > > > > > > the public interface section.
> > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a public API?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the discussion,
> > > > > > > > > > > > > > > > > > > > Yangze.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in Flink
> > > > > > > > > > > > > > > > > > > > is significant, especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I
> > > > > > > > > > > > > > > > > > > > think it's a very good first step for Flink's GPU support.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add
> > > > > > > > > > > > > > > > > > > > > GPU support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs are in a task
> > > > > > > > > > > > > > > > > > > > > executor and forward such requirements to the external
> > > > > > > > > > > > > > > > > > > > > resource managers (for Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU resources to
> > > > > > > > > > > > > > > > > > > > > operators.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services to
> > > > > > > > > > > > > > > > > > > > > discover and expose GPU resource information to the context
> > > > > > > > > > > > > > > > > > > > > of functions.
> > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which we
> > > > > > > > > > > > > > > > > > > > > provide the privilege mode to help users achieve worker-level
> > > > > > > > > > > > > > > > > > > > > isolation in standalone mode.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki document [1].
> > > > > > > > > > > > > > > > > > > > > Looking forward to your feedback.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > >

Till Rohrmann

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Hi everyone,

I'm a bit late to the party. I think the current proposal looks good.

Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
would suggest not including the decorator calls for Kubernetes and Yarn in
the base interface. Instead, I would suggest segregating the
deployment-specific decorator calls into separate interfaces. That way an
ExternalResourceDriver does not have to support all deployments from the
very beginning. Moreover, some resources might not be supported by a
specific deployment target, and the natural way to express this would be to
not implement the respective deployment-specific interface.

Moreover, having void
addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
in the ExternalResourceDriver interface would require Hadoop on Flink's
classpath whenever the external resource driver is being used.
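
The segregation suggested above could look roughly like the following. This
is only a minimal sketch, not the interface from the FLIP: the names
YarnResourceDecorator, KubernetesResourceDecorator, GpuDriver and Deployments
are hypothetical, and plain maps stand in for AMRMClient.ContainerRequest and
the TM pod so that the example compiles without Hadoop or a Kubernetes client
on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

/** Small demo entry point; declared first so single-file launchers pick it up. */
final class SegregatedDriverSketch {
    public static void main(String[] args) {
        ExternalResourceDriver gpu = new GpuDriver(2);
        System.out.println("yarn supported: " + Deployments.supportsYarn(gpu));
        System.out.println("k8s limits: " + Deployments.kubernetesLimits(gpu));
    }
}

/** Hypothetical base interface: nothing deployment-specific lives here. */
interface ExternalResourceDriver {
    String resourceName();

    long amount();
}

/**
 * Hypothetical opt-in interface for Yarn. The method quoted in this thread takes
 * AMRMClient.ContainerRequest; a plain map is an assumption made for this sketch.
 */
interface YarnResourceDecorator {
    void addExternalResourceToRequest(Map<String, Long> containerRequest);
}

/** Hypothetical opt-in interface for Kubernetes; the map stands in for pod resource limits. */
interface KubernetesResourceDecorator {
    void decoratePod(Map<String, Long> podResourceLimits);
}

/** Example driver that supports Kubernetes only, expressed by not implementing the Yarn interface. */
final class GpuDriver implements ExternalResourceDriver, KubernetesResourceDecorator {
    private final long amount;

    GpuDriver(long amount) {
        this.amount = amount;
    }

    @Override
    public String resourceName() {
        return "gpu";
    }

    @Override
    public long amount() {
        return amount;
    }

    @Override
    public void decoratePod(Map<String, Long> podResourceLimits) {
        // "nvidia.com/gpu" is the resource name used by the NVIDIA device plugin.
        podResourceLimits.put("nvidia.com/gpu", amount);
    }
}

/** Deployment-side code checks for the optional interface instead of calling it unconditionally. */
final class Deployments {
    private Deployments() {}

    static boolean supportsYarn(ExternalResourceDriver driver) {
        return driver instanceof YarnResourceDecorator;
    }

    static Map<String, Long> kubernetesLimits(ExternalResourceDriver driver) {
        Map<String, Long> limits = new HashMap<>();
        if (driver instanceof KubernetesResourceDecorator) {
            ((KubernetesResourceDecorator) driver).decoratePod(limits);
        }
        return limits;
    }
}
```

With such a split, only the Yarn-specific interface would need to reference
Hadoop types, so a driver that does not implement it would not pull Hadoop
onto the classpath.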

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink

Cheers,
Till

On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:

> Nice, thanks a lot!
>
> On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
>
> > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> >
> > I've updated the FLIP accordingly. I did not add a
> > ResourceInfoProvider. Instead, I introduced the ExternalResourceDriver,
> > which takes responsibility for all relevant operations on both the RM
> > and TM sides.
> > After a rethink about decoupling the management of external resources
> > from TaskExecutor, I think we could do the same thing on the
> > ResourceManager side. We do not need to add a specific allocation
> > logic to the ResourceManager each time we add a specific external
> > resource.
> > - For Yarn, we need the ExternalResourceDriver to edit the
> > containerRequest.
> > - For Kubernetes, ExternalResourceDriver could provide a decorator for
> > the TM pod.
> >
> > In this way, just like MetricReporter, we allow users to define their
> > custom ExternalResourceDriver. It is more extensible and fits the
> > separation of concerns. For more details, please take a look at [1].
> >
> > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > >
> > > This sounds good to go ahead from my side.
> > >
> > > I like the approach that Becket suggested - in that case the core
> > > abstraction that everyone would need to understand would be "external
> > > resource allocation" and the "ResourceInfoProvider", and the GPU
> > > specific code would be a specific implementation only known to that
> > > component that allocates the external resource. That fits the
> > > separation of concerns well.
> > >
> > > I also understand that it should not be over-engineered in the first
> > > version, so some simplification makes sense, and then gradually expand
> > > from there.
> > >
> > > So +1 to go ahead with what was suggested above (Xintong / Becket) from
> > > my side.
> > >
> > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > >
> > > > Thanks for the comments, Stephan & Becket.
> > > >
> > > > @Stephan
> > > >
> > > > I see your concern, and I completely agree with you that we should
> > > > first think about the "library" / "plugin" / "extension" style if
> > > > possible.
> > > >
> > > > > If GPUs are sliced and assigned during scheduling, there may be
> > > > > reason, although it looks that it would belong to the slot then. Is
> > > > > that what we are doing here?
> > > >
> > > >
> > > > In the current proposal, we do not have the GPUs sliced and assigned
> > > > to slots, because it could be problematic without dynamic slot
> > > > allocation. E.g., the number of GPUs might not be evenly divisible by
> > > > the number of slots.
> > > >
> > > > I think it makes sense to eventually have the GPUs assigned to slots.
> > > > Even then, we might still need a TM level GPUManager (or
> > > > ResourceProvider like Becket suggested). For memory, in each slot we
> > > > can simply request the amount of memory, leaving it to JVM / OS to
> > > > decide which memory (address) should be assigned. For GPU, and
> > > > potentially other resources like FPGA, we need to explicitly specify
> > > > which GPU (index) should be used. Therefore, we need some component at
> > > > the TM level to coordinate which slot uses which GPU.
> > > >
> > > > IMO, unless we say Flink will not support slot-level GPU slicing at
> > > > least in the foreseeable future, I don't see a good way to avoid
> > > > touching the TM core. To that end, I think Becket's suggestion points
> > > > to a good direction, that supports more features (GPU, FPGA, etc.)
> > > > with less coupling to the TM core (only needs to understand the
> > > > general interfaces). The detailed implementation for specific resource
> > > > types can even be encapsulated as a library.
> > > >
> > > > @Becket
> > > >
> > > > Thanks for sharing your thoughts on the final state. Regardless of
> > > > the details of how the interfaces should look, I think this is a
> > > > really good abstraction for supporting general resource types.
> > > >
> > > > I'd like to further clarify that the following three things are all
> > > > that the "Flink core" needs to understand.
> > > >
> > > > - The *amount* of resource, for scheduling. Actually, we already have
> > > > the Resource class in ResourceProfile and ResourceSpec for extended
> > > > resource. It's just not really used.
> > > > - The *info*, that Flink provides to the operators / user codes.
> > > > - The *provider*, which generates the info based on the amount.
> > > >
> > > > The "core" does not need to understand the specific implementation
> > > > details of the above three. They can even be implemented in a
> > > > 3rd-party library, similar to how we allow users to define their
> > > > custom MetricReporter.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
> > > >
> > > > > Thanks for the comment, Stephan.
> > > > >
> > > > > > - If everything becomes a "core feature", it will make the project
> > > > > > hard to develop in the future. Thinking "library" / "plugin" /
> > > > > > "extension" style where possible helps.
> > > > >
> > > > >
> > > > > Completely agree. It is much more important to design a mechanism
> > > > > than focusing on a specific case. Here is what I am thinking to
> > > > > fully support custom resource management:
> > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to
> > > > > define the resource and the amount required. They will be used to
> > > > > find suitable TM slots to run the tasks. At this point, the
> > > > > resources are only measured by amount, i.e. they do not have
> > > > > individual IDs.
> > > > >
> > > > > 2. On the TM side, have something like *"ResourceInfoProvider"* to
> > > > > identify and provide the detailed information of the individual
> > > > > resource, e.g. GPU ID. It is important because the operator may have
> > > > > to explicitly interact with the physical resource it uses. The
> > > > > ResourceInfoProvider might look like something below.
> > > > > interface ResourceInfoProvider<INFO> {
> > > > >     Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId,
> > > > >             ResourceProfile resourceProfile);
> > > > > }
> > > > >
> > > > > - There could be several "*ResourceInfoProvider*" configured on the
> > > > > TM to retrieve the information for different resources.
> > > > > - The TM will be responsible for assigning those individual
> > > > > resources to each operator according to their requested amount.
> > > > > - The operators will be able to get the ResourceInfo from their
> > > > > RuntimeContext.
> > > > >
> > > > > If we agree this is a reasonable final state, we can adapt the
> > > > > current FLIP to it. In fact it does not sound like a big change to
> > > > > me. All the proposed configuration can stay as is; it is just that
> > > > > Flink itself won't care about them. Instead, a GPUInfoProvider
> > > > > implementing the ResourceInfoProvider will use them.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]> wrote:
> > > > >
> > > > > > Hi all!
> > > > > >
> > > > > > The main point I wanted to throw into the discussion is the
> > > > > > following:
> > > > > > - With more and more use cases, more and more tools go into Flink
> > > > > > - If everything becomes a "core feature", it will make the project
> > > > > > hard to develop in the future. Thinking "library" / "plugin" /
> > > > > > "extension" style where possible helps.
> > > > > >
> > > > > > - A good thought experiment is always: How many future developers
> > > > > > have to interact with this code (and possibly understand it
> > > > > > partially), even if the features they touch have nothing to do
> > > > > > with GPU support. If many contributors to unrelated features will
> > > > > > have to touch it and understand it, then let's think if there is a
> > > > > > different solution. Maybe there is not, but then we should be sure
> > > > > > why.
> > > > > >
> > > > > > - That led me to raising this issue: If the GPU manager becomes a
> > > > > > core service in the TaskManager, Environment, RuntimeContext, etc.
> > > > > > then everyone developing TM and streaming tasks needs to
> > > > > > understand the GPU manager. That seems oddly specific, is my
> > > > > > impression.
> > > > > >
> > > > > > Access to configuration seems not the right reason to do that. We
> > > > should
> > > > > > expose the Flink configuration from the RuntimeContext anyways.
> > > > > >
> > > > > > If GPUs are sliced and assigned during scheduling, there may be
> > reason,
> > > > > > although it looks that it would belong to the slot then. Is that
> > what
> > > > we
> > > > > > are doing here?
> > > > > >
> > > > > > Best,
> > > > > > Stephan
> > > > > >
> > > > > >
> > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <
> > [hidden email]>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for the feedback, Becket.
> > > > > > >
> > > > > > > IMO, eventually an operator should only see info about the GPUs
> > > > > > > that are dedicated to it, instead of all GPUs on the
> > > > > > > machine/container as in the current design. It does not make
> > > > > > > sense to make the user who writes a UDF worry about
> > > > > > > coordination among multiple operators running on the same
> > > > > > > machine. And if we want to limit the GPU info an operator sees,
> > > > > > > we should not let the operator instantiate the GPUManager,
> > > > > > > which means we have to expose something through the runtime
> > > > > > > context, either GPU info or some kind of limited access to the
> > > > > > > GPUManager.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <
> [hidden email]
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > It probably makes sense for us to first agree on the final
> > > > > > > > state. More specifically, will the resource info be exposed
> > > > > > > > through the runtime context eventually?
> > > > > > > >
> > > > > > > > If that is the final state and we have a seamless migration
> > > > > > > > story from this FLIP to that final state, personally I think
> > > > > > > > it is OK to expose the GPU info in the runtime context.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > > > [hidden email]
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > @Yangze,
> > > > > > > > > I think what Stephan means (@Stephan, please correct me if
> > I'm
> > > > > wrong)
> > > > > > > is
> > > > > > > > > that, we might not need to hold and maintain the GPUManager
> > as a
> > > > > > > service
> > > > > > > > in
> > > > > > > > > TaskManagerServices or RuntimeContext. An alternative is to
> > > > create
> > > > > /
> > > > > > > > > retrieve the GPUManager only in the operators that need it,
> > e.g.,
> > > > > > with
> > > > > > > a
> > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > >
> > > > > > > > > @Stephan,
> > > > > > > > > I agree with you on excluding GPUManager from
> > > > TaskManagerServices.
> > > > > > > > >
> > > > > > > > > - For the first step, where we provide unified TM-level GPU
> > > > > > > > > information to all operators, it should be fine to have
> > > > > > > > > operators access / lazily initialize the GPUManager by
> > > > > > > > > themselves.
> > > > > > > > > - In the future, we might have more fine-grained GPU
> > > > > > > > > management, where we need to maintain the GPUManager as a
> > > > > > > > > service and put GPU info in slot profiles. But at least for
> > > > > > > > > now it's not necessary to introduce such complexity.
> > > > > > > > >
> > > > > > > > > However, I have some concerns about excluding GPUManager
> > > > > > > > > from RuntimeContext and letting operators access it
> > > > > > > > > directly.
> > > > > > > > >
> > > > > > > > > - The configuration needed for creating the GPUManager is
> > > > > > > > > not always available to operators.
> > > > > > > > > - If we later want fine-grained control over GPUs (e.g.,
> > > > > > > > > operators in each slot can only see the GPUs reserved for
> > > > > > > > > that slot), the approach cannot be easily extended.
> > > > > > > > >
> > > > > > > > > I would suggest wrapping the GPUManager behind
> > > > > > > > > RuntimeContext and only exposing the GPUInfo to users. For
> > > > > > > > > now, we can declare a method `getGPUInfo()` in
> > > > > > > > > RuntimeContext, with a default definition that calls
> > > > > > > > > `GPUManager.get()` to get the lazily-created GPUManager. If
> > > > > > > > > we later want to create / retrieve the GPUManager in a
> > > > > > > > > different way, we can simply change how `getGPUInfo` is
> > > > > > > > > implemented, without needing to change any public
> > > > > > > > > interfaces.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <
> > [hidden email]>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > @Stephan
> > > > > > > > > > Do you mean Minicluster? Yes, it makes sense to share the
> > > > > > > > > > GPU Manager in such a scenario.
> > > > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > > > GPUManager(ExternalResourceManagers) in TaskExecutor
> > > > > > > > > > instead of TaskManagerServices.
> > > > > > > > > >
> > > > > > > > > > Regarding the RuntimeContext/FunctionContext, it just
> > holds the
> > > > > GPU
> > > > > > > > > > info instead of the GPU Manager. AFAIK, it's the only
> > place we
> > > > > > could
> > > > > > > > > > pass GPU info to the RichFunction/UserDefinedFunction.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <
> > > > > > [hidden email]
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000
> [hidden email]
> > > > wrote
> > > > > > > ----
> > > > > > > > > > >
> > > > > > > > > > > > > Can we somehow keep this out of the TaskManager
> > services
> > > > > > > > > > > > I fear that we could not. IMO, the GPUManager(or
> > > > > > > > > > > > ExternalServicesManagers in future) is conceptually
> > one of
> > > > > the
> > > > > > > task
> > > > > > > > > > > > manager services, just like MemoryManager before
> 1.10.
> > > > > > > > > > > > - It maintains/holds the GPU resource at TM level and
> > all
> > > > of
> > > > > > the
> > > > > > > > > > > > operators allocate the GPU resources from it. So, it
> > should
> > > > > be
> > > > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > > > - We could add a collection called
> > ExternalResourceManagers
> > > > > to
> > > > > > > hold
> > > > > > > > > > > > all managers of other external resources in the
> future.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Can you help me understand why this needs the addition
> > > > > > > > > > > in TaskManagerServices or in the RuntimeContext?
> > > > > > > > > > > Are you worried about the case when multiple Task
> > Executors
> > > > run
> > > > > > in
> > > > > > > > the
> > > > > > > > > > same
> > > > > > > > > > > JVM? That's not common, but wouldn't it actually be
> good
> > in
> > > > > that
> > > > > > > case
> > > > > > > > > to
> > > > > > > > > > > share the GPU Manager, given that the GPU is shared?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Stephan
> > > > > > > > > > >
> > > > > > > > > > > ---------------------------
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > In this FLIP, operators need the information. Thus,
> we
> > > > expose
> > > > > > GPU
> > > > > > > > > > > > information to the RuntimeContext/FunctionContext.
> The
> > slot
> > > > > > > profile
> > > > > > > > > is
> > > > > > > > > > > > not aware of GPU resources as GPU is TM level
> resource
> > now.
> > > > > > > > > > > >
> > > > > > > > > > > > > Can the GPU Manager be a "self contained" thing
> that
> > > > simply
> > > > > > > takes
> > > > > > > > > the
> > > > > > > > > > > > configuration, and then abstracts everything
> > internally?
> > > > > > > > > > > > Yes, we just pass the path/args of the discovery
> > > > > > > > > > > > script and how many GPUs per TM to it. It takes the
> > > > > > > > > > > > responsibility to get the GPU information and expose
> > > > > > > > > > > > it to the RuntimeContext/FunctionContext of
> > > > > > > > > > > > operators. Meanwhile, we'd better not allow operators
> > > > > > > > > > > > to directly access the GPUManager; they should get
> > > > > > > > > > > > what they need from the Context. We could then
> > > > > > > > > > > > decouple the interface/implementation of GPUManager
> > > > > > > > > > > > from the Public API.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <
> > > > > [hidden email]
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > It sounds fine to initially start with GPU specific
> > > > support
> > > > > > and
> > > > > > > > > think
> > > > > > > > > > > > about
> > > > > > > > > > > > > generalizing this once we better understand the
> > space.
> > > > > > > > > > > > >
> > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager
> > > > services?
> > > > > > > > > Anything
> > > > > > > > > > we
> > > > > > > > > > > > > have to pull through all layers of the TM makes the
> > TM
> > > > > > > components
> > > > > > > > > yet
> > > > > > > > > > > > more
> > > > > > > > > > > > > complex and harder to maintain.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > -> do the slot profiles need information about the
> > GPU?
> > > > > > > > > > > > > -> Can the GPU Manager be a "self contained" thing
> > that
> > > > > > simply
> > > > > > > > > takes
> > > > > > > > > > > > > the configuration, and then abstracts everything
> > > > > internally?
> > > > > > > > > > Operators
> > > > > > > > > > > > can
> > > > > > > > > > > > > access it via "GPUManager.get()" or so?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <
> > > > > > [hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right,
> > I'll add
> > > > > > them
> > > > > > > to
> > > > > > > > > the
> > > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > Regarding the general extended resource
> mechanism,
> > I
> > > > > second
> > > > > > > > > > Xintong's
> > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > - It's better to leverage ResourceProfile and
> > > > > > > > > > > > > > ResourceSpec once we support fine-grained GPU
> > > > > > > > > > > > > > scheduling. As a first-step proposal, I prefer not to
> > > > > > > > > > > > > > include it in the scope of this FLIP.
> > > > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I
> > > > > > > > > > > > > > understand correctly, it is just a code refactoring
> > > > > > > > > > > > > > atm; we could extract the
> > > > > > > > > > > > > > open/close/allocateExtendResources of GPUManager to
> > > > > > > > > > > > > > that interface. If that is the case, +1 to do it
> > > > > > > > > > > > > > during implementation.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > As Xintong said, we looked into how Spark
> supports
> > a
> > > > > > general
> > > > > > > > > > "Custom
> > > > > > > > > > > > > > Resource Scheduling" before and decided to
> > introduce a
> > > > > > common
> > > > > > > > > > resource
> > > > > > > > > > > > > > configuration
> > > > > > > > > > > > > >
> > > > > > > > >
> > > > schema(taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > to make it more extensible. I think the
> "resource"
> > is a
> > > > > > > proper
> > > > > > > > > > level
> > > > > > > > > > > > > > to contain all the configs of extended resources.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <
> > > > > > > > [hidden email]
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > There is no doubt that GPU resource management
> > > > support
> > > > > > will
> > > > > > > > > > greatly
> > > > > > > > > > > > > > > facilitate the development of AI-related
> > applications
> > > > > by
> > > > > > > > > PyFlink
> > > > > > > > > > > > users.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I
> > > > > > > > > > > > > > > think it is better to delete the "resource" field,
> > > > > > > > > > > > > > > which makes them consistent with the names of other
> > > > > > > > > > > > > > > resource-related configurations in TaskManagerOptions.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline
> > > > > > > > > > > > > > > > discussion about making the "GPU Support" a more
> > > > > > > > > > > > > > > > general "Extended Resource Support". We believe
> > > > > > > > > > > > > > > > supporting extended resources through a general
> > > > > > > > > > > > > > > > mechanism is definitely a good and extensible way.
> > > > > > > > > > > > > > > > The reason we propose this FLIP with its scope
> > > > > > > > > > > > > > > > narrowed down to GPU alone is mainly the concern
> > > > > > > > > > > > > > > > about the extra effort and review capacity needed
> > > > > > > > > > > > > > > > for a general mechanism.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > To come up with a good design for a general extended
> > > > > > > > > > > > > > > > resource management mechanism, we would need to
> > > > > > > > > > > > > > > > investigate more how people use different kinds of
> > > > > > > > > > > > > > > > resources in practice. For GPU, we learnt such
> > > > > > > > > > > > > > > > knowledge from the experts, Becket and his team
> > > > > > > > > > > > > > > > members. But for FPGA, or other potential extended
> > > > > > > > > > > > > > > > resources, we don't have such convenient information
> > > > > > > > > > > > > > > > sources, so the investigation would require more
> > > > > > > > > > > > > > > > effort, which I tend to think is not necessary atm.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On the other hand, we also looked into how
> > Spark
> > > > > > > supports a
> > > > > > > > > > general
> > > > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > > Resource Scheduling". Assuming we want to
> have
> > a
> > > > > > similar
> > > > > > > > > > general
> > > > > > > > > > > > > > extended
> > > > > > > > > > > > > > > > resource mechanism in the future, we believe
> > that
> > > > the
> > > > > > > > current
> > > > > > > > > > GPU
> > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > design can be easily extended, in an
> > incremental
> > > > way
> > > > > > > > without
> > > > > > > > > > too
> > > > > > > > > > > > many
> > > > > > > > > > > > > > > > reworks.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - The most important part is probably user interfaces.
> > > > > > > > > > > > > > > > Spark offers configuration options to define the
> > > > > > > > > > > > > > > > amount, discovery script and vendor (on k8s) on a
> > > > > > > > > > > > > > > > per-resource-type basis [1], which is very similar
> > > > > > > > > > > > > > > > to what we proposed in this FLIP. I think it's not
> > > > > > > > > > > > > > > > necessary to expose config options in the general
> > > > > > > > > > > > > > > > way atm, since we do not have support for other
> > > > > > > > > > > > > > > > resource types now. If we later decide to have
> > > > > > > > > > > > > > > > per-resource-type config options, we can keep
> > > > > > > > > > > > > > > > backwards compatibility with the currently proposed
> > > > > > > > > > > > > > > > options through simple key mapping.
> > > > > > > > > > > > > > > > - For the GPU Manager, if needed later we can change
> > > > > > > > > > > > > > > > it to an "Extended Resource Manager" (or whatever it
> > > > > > > > > > > > > > > > is called). That should be a pure component-internal
> > > > > > > > > > > > > > > > refactoring.
> > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are
> > > > > > > > > > > > > > > > already fields for general extended resources. We
> > > > > > > > > > > > > > > > can of course leverage them when supporting
> > > > > > > > > > > > > > > > fine-grained GPU scheduling. That is also not in the
> > > > > > > > > > > > > > > > scope of this first-step proposal, and would require
> > > > > > > > > > > > > > > > FLIP-56 to be finished first.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > To sum up, I agree with Becket on having a separate
> > > > > > > > > > > > > > > > FLIP for the general extended resource mechanism,
> > > > > > > > > > > > > > > > and keeping it in mind when discussing and
> > > > > > > > > > > > > > > > implementing the current one.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <
> > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes
> total
> > > > sense
> > > > > to
> > > > > > > > > > generalize
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > resource management to support custom
> > resources.
> > > > > > Having
> > > > > > > > > that
> > > > > > > > > > > > allows
> > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > to add new resources by themselves. The
> > general
> > > > > > > resource
> > > > > > > > > > > > management
> > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > involve two different aspects:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is
> > > > > > > > > > > > > > > > > supported by the extended resources in
> > > > > > > > > > > > > > > > > ResourceProfile and ResourceSpec. This will likely
> > > > > > > > > > > > > > > > > cover the majority of the cases.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2. The custom resource allocation logic,
> > i.e. how
> > > > > to
> > > > > > > > assign
> > > > > > > > > > the
> > > > > > > > > > > > > > resources
> > > > > > > > > > > > > > > > > to different tasks, operators, and so on.
> > This
> > > > may
> > > > > > > > require
> > > > > > > > > > two
> > > > > > > > > > > > > > levels /
> > > > > > > > > > > > > > > > > steps:
> > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks
> > are put
> > > > > > into
> > > > > > > > > > > > suitable
> > > > > > > > > > > > > > > > slots.
> > > > > > > > > > > > > > > > > It is done by the global RM and is not
> > > > customizable
> > > > > > > right
> > > > > > > > > > now.
> > > > > > > > > > > > > > > > > b. Operator level - map the exact resource
> > to the
> > > > > > > > operators
> > > > > > > > > > > > in
> > > > > > > > > > > > > > TM.
> > > > > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > > > > GPU 1 for operator A, GPU 2 for operator B.
> > This
> > > > > step
> > > > > > > is
> > > > > > > > > > needed
> > > > > > > > > > > > > > assuming
> > > > > > > > > > > > > > > > > the global RM does not distinguish
> individual
> > > > > > resources
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > same
> > > > > > > > > > > > > > type.
> > > > > > > > > > > > > > > > > It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it
> > > > > > > > > > > > > > > > > should discover the physical GPU information and
> > > > > > > > > > > > > > > > > bind/match it to each operator. Making this general
> > > > > > > > > > > > > > > > > will fill in the missing piece to support custom
> > > > > > > > > > > > > > > > > resource type definition. But I'd avoid calling it
> > > > > > > > > > > > > > > > > an "External Resource Manager" to avoid confusion
> > > > > > > > > > > > > > > > > with RM; maybe something like "Operator Resource
> > > > > > > > > > > > > > > > > Assigner" would be more accurate. So for each
> > > > > > > > > > > > > > > > > resource type users can have an optional "Operator
> > > > > > > > > > > > > > > > > Resource Assigner" in the TM. For memory, users
> > > > > > > > > > > > > > > > > don't need this, but for other extended resources,
> > > > > > > > > > > > > > > > > users may need that.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator
> > Resource
> > > > > > > > Assigner"
> > > > > > > > > > is
> > > > > > > > > > > > > > achievable
> > > > > > > > > > > > > > > > > in this FLIP. But I am also OK with having
> > that
> > > > in
> > > > > a
> > > > > > > > > separate
> > > > > > > > > > > > FLIP
> > > > > > > > > > > > > > > > because
> > > > > > > > > > > > > > > > > the interface between the "Operator
> Resource
> > > > > > Assigner"
> > > > > > > > and
> > > > > > > > > > > > operator
> > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > take a while to settle down if we want to
> > make it
> > > > > > > > generic.
> > > > > > > > > > But I
> > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > > implementation should take this future work
> > into
> > > > > > > > > > consideration so
> > > > > > > > > > > > > > that we
> > > > > > > > > > > > > > > > > don't need to break backwards compatibility
> > once
> > > > we
> > > > > > > have
> > > > > > > > > > that.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan
> Ewen
> > <
> > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I cannot really give much input into the
> > > > > mechanics
> > > > > > of
> > > > > > > > > > GPU-aware
> > > > > > > > > > > > > > > > > scheduling
> > > > > > > > > > > > > > > > > > and GPU allocation, as I have no
> experience
> > > > with
> > > > > > > that.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > One thought I had when reading the
> > proposal is
> > > > if
> > > > > > it
> > > > > > > > > makes
> > > > > > > > > > > > sense to
> > > > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > the "GPU Manager" as an "External
> Resource
> > > > > > Manager",
> > > > > > > > and
> > > > > > > > > > GPU
> > > > > > > > > > > > is one
> > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > resource.
> > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile
> > and
> > > > > > > > > ResourceSpec,
> > > > > > > > > > > > that is
> > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > is done there.
> > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible.
> > > > > > > > > > > > > > > > > > Maybe there is a GPU Resource, a specialized NVIDIA
> > > > > > > > > > > > > > > > > > GPU Resource, an FPGA Resource, an Alibaba TPU
> > > > > > > > > > > > > > > > > > Resource, etc.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket
> Qin <
> > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks for the FLIP, Yangze. GPU resource management
> > > > > > > > > > > > > > > > > > > support is a must-have for machine learning use
> > > > > > > > > > > > > > > > > > > cases. Actually, it is one of the most frequently
> > > > > > > > > > > > > > > > > > > asked questions from users who are interested in
> > > > > > > > > > > > > > > > > > > using Flink for ML.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Some quick comments / questions to the
> > wiki.
> > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably
> > also
> > > > be
> > > > > > > > > > mentioned in
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > public
> > > > > > > > > > > > > > > > > > > interface section.
> > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU
> > info
> > > > > > also a
> > > > > > > > > > public
> > > > > > > > > > > > API?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong
> > Song
> > > > <
> > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and
> > kicking
> > > > off
> > > > > > the
> > > > > > > > > > > > discussion,
> > > > > > > > > > > > > > > > Yangze.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs
> > > > > > > > > > > > > > > > > > > > in Flink is significant, especially for ML
> > > > > > > > > > > > > > > > > > > > scenarios.
> > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to
> > > > > > > > > > > > > > > > > > > > me. I think it's a very good first step for Flink's
> > > > > > > > > > > > > > > > > > > > GPU support.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add GPU
> > > > > > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs a task executor has and
> > > > > > > > > > > > > > > > > > > > > > > forward such requirements to the external resource managers (for
> > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU resources to operators.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services to discover
> > > > > > > > > > > > > > > > > > > > > > > and expose GPU resource information to the context of functions.
> > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which we provide
> > > > > > > > > > > > > > > > > > > > > > > the privilege mode to help users achieve worker-level isolation in
> > > > > > > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki document [1]. Looking forward
> > > > > > > > > > > > > > > > > > > > > > > to your feedback.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Yangze Guo

Xintong Song

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Thanks for updating the FLIP, Yangze.

I agree with Till that we probably want to separate the K8s/Yarn decorator
calls. Users can still configure one driver class, and we can use
`instanceof` to check whether the driver implements the K8s/Yarn specific
interfaces.

Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
(`AbstractKubernetesStepDecorator` directly manipulates the `Pod`) to user
code. It gives user code more access than it needs for defining an external
resource, which might cause problems. Instead, I would suggest having an
interface like `Map<String key, String value>
getYarn/KubernetesExternalResource()` and assembling the returned entries into
the `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
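
To make the idea concrete, here is a minimal sketch, for illustration only.
All interface, class and method names below (`YarnExternalResource`,
`KubernetesExternalResource`, `GpuDriver`, `FpgaDriver`, `yarnEntries`,
`kubernetesEntries`) are hypothetical and not taken from the FLIP, and the
resource keys are only example values. It shows a driver opting into a
deployment by implementing the matching interface, and the resource manager
side checking with `instanceof` and receiving plain key/value pairs rather
than the `ContainerRequest` / `Pod` itself.

```java
import java.util.Collections;
import java.util.Map;

final class ExternalResourceSketch {

    /** Deployment-agnostic base interface (hypothetical shape, not the FLIP's). */
    interface ExternalResourceDriver {
        String resourceName();
    }

    /** Implemented only by drivers that support Yarn (hypothetical). */
    interface YarnExternalResource {
        Map<String, String> getYarnExternalResource();
    }

    /** Implemented only by drivers that support Kubernetes (hypothetical). */
    interface KubernetesExternalResource {
        Map<String, String> getKubernetesExternalResource();
    }

    /** Example driver that supports both deployments. */
    static final class GpuDriver
            implements ExternalResourceDriver, YarnExternalResource, KubernetesExternalResource {
        private final int amount;

        GpuDriver(int amount) {
            this.amount = amount;
        }

        @Override
        public String resourceName() {
            return "gpu";
        }

        @Override
        public Map<String, String> getYarnExternalResource() {
            // Example key only; the real key depends on the cluster setup.
            return Collections.singletonMap("yarn.io/gpu", String.valueOf(amount));
        }

        @Override
        public Map<String, String> getKubernetesExternalResource() {
            // Example key only; the real key depends on the device plugin in use.
            return Collections.singletonMap("nvidia.com/gpu", String.valueOf(amount));
        }
    }

    /** Example driver that supports Kubernetes only, so it skips the Yarn interface. */
    static final class FpgaDriver implements ExternalResourceDriver, KubernetesExternalResource {
        private final int amount;

        FpgaDriver(int amount) {
            this.amount = amount;
        }

        @Override
        public String resourceName() {
            return "fpga";
        }

        @Override
        public Map<String, String> getKubernetesExternalResource() {
            // Made-up key for illustration.
            return Collections.singletonMap("example.com/fpga", String.valueOf(amount));
        }
    }

    /** What a Yarn resource manager could do: ask only drivers that opted in. */
    static Map<String, String> yarnEntries(ExternalResourceDriver driver) {
        if (driver instanceof YarnExternalResource) {
            return ((YarnExternalResource) driver).getYarnExternalResource();
        }
        return Collections.emptyMap();
    }

    /** Same check for the Kubernetes resource manager. */
    static Map<String, String> kubernetesEntries(ExternalResourceDriver driver) {
        if (driver instanceof KubernetesExternalResource) {
            return ((KubernetesExternalResource) driver).getKubernetesExternalResource();
        }
        return Collections.emptyMap();
    }

    public static void main(String[] args) {
        ExternalResourceDriver gpu = new GpuDriver(2);
        ExternalResourceDriver fpga = new FpgaDriver(1);
        System.out.println("Yarn entries for gpu:  " + yarnEntries(gpu));
        System.out.println("Yarn entries for fpga: " + yarnEntries(fpga));
        System.out.println("K8s entries for fpga:  " + kubernetesEntries(fpga));
    }
}
```

With this shape the driver never touches Hadoop or Kubernetes client classes,
so a driver jar would not need them on its classpath; the resource manager
alone translates the map entries into the actual request.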

Thank you~

Xintong Song

On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:

> Hi everyone,
>
> I'm a bit late to the party. I think the current proposal looks good.
>
> Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> would suggest to not include the decorator calls for Kubernetes and Yarn in
> the base interface. Instead I would suggest to segregate the deployment
> specific decorator calls into separate interfaces. That way an
> ExternalResourceDriver does not have to support all deployments from the
> very beginning. Moreover, some resources might not be supported by a
> specific deployment target and the natural way to express this would be to
> not implement the respective deployment specific interface.
>
> Moreover, having void
> addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> in the ExternalResourceDriver interface would require Hadoop on Flink's
> classpath whenever the external resource driver is being used.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
>
> Cheers,
> Till
>
> On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:
>
> > Nice, thanks a lot!
> >
> > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
> >
> > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > >
> > > I've updated the FLIP accordingly. I do not add a
> > > ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> > > which takes the responsibility of all relevant operations on both RM
> > > and TM sides.
> > > After a rethink about decoupling the management of external resources
> > > from TaskExecutor, I think we could do the same thing on the
> > > ResourceManager side. We do not need to add a specific allocation
> > > logic to the ResourceManager each time we add a specific external
> > > resource.
> > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > containerRequest.
> > > - For Kubernetes, ExternalResourceDriver could provide a decorator for
> > > the TM pod.
> > >
> > > In this way, just like MetricReporter, we allow users to define their
> > > custom ExternalResourceDriver. It is more extensible and fits the
> > > separation of concerns. For more details, please take a look at [1].
> > >
> > > [1]
> > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > > >
> > > > This sounds good to go ahead from my side.
> > > >
> > > > I like the approach that Becket suggested - in that case the core
> > > > abstraction that everyone would need to understand would be "external
> > > > resource allocation" and the "ResourceInfoProvider", and the GPU specific
> > > > code would be a specific implementation only known to that component that
> > > > allocates the external resource. That fits the separation of concerns well.
> > > >
> > > > I also understand that it should not be over-engineered in the first
> > > > version, so some simplification makes sense, and then gradually expand from
> > > > there.
> > > >
> > > > So +1 to go ahead with what was suggested above (Xintong / Becket) from my
> > > > side.
> > > >
> > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > > >
> > > > > Thanks for the comments, Stephan & Becket.
> > > > >
> > > > > @Stephan
> > > > >
> > > > > I see your concern, and I completely agree with you that we should first
> > > > > think about the "library" / "plugin" / "extension" style if possible.
> > > > >
> > > > > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > > > > although it looks that it would belong to the slot then. Is that what we
> > > > > > are doing here?
> > > > >
> > > > >
> > > > > In the current proposal, we do not have the GPUs sliced and assigned to
> > > > > slots, because it could be problematic without dynamic slot allocation.
> > > > > E.g., the number of GPUs might not be evenly divisible by the number of
> > > > > slots.
> > > > >
> > > > > I think it makes sense to eventually have the GPUs assigned to slots. Even
> > > > > then, we might still need a TM level GPUManager (or ResourceProvider like
> > > > > Becket suggested). For memory, in each slot we can simply request the
> > > > > amount of memory, leaving it to JVM / OS to decide which memory (address)
> > > > > should be assigned. For GPU, and potentially other resources like FPGA, we
> > > > > need to explicitly specify which GPU (index) should be used. Therefore, we
> > > > > need some component at the TM level to coordinate which slot uses which
> > > > > GPU.
> > > > >
> > > > > IMO, unless we say Flink will not support slot-level GPU slicing at least
> > > > > in the foreseeable future, I don't see a good way to avoid touching the TM
> > > > > core. To that end, I think Becket's suggestion points to a good direction,
> > > > > that supports more features (GPU, FPGA, etc.) with less coupling to the TM
> > > > > core (only needs to understand the general interfaces). The detailed
> > > > > implementation for specific resource types can even be encapsulated as a
> > > > > library.
> > > > >
> > > > > @Becket
> > > > >
> > > > > Thanks for sharing your thought on the final state. Despite the details how
> > > > > the interfaces should look like, I think this is a really good abstraction
> > > > > for supporting general resource types.
> > > > >
> > > > > I'd like to further clarify that, the following three things are all that
> > > > > the "Flink core" needs to understand.
> > > > >
> > > > > - The *amount* of resource, for scheduling. Actually, we already have
> > > > >   the Resource class in ResourceProfile and ResourceSpec for extended
> > > > >   resource. It's just not really used.
> > > > > - The *info*, that Flink provides to the operators / user codes.
> > > > > - The *provider*, which generates the info based on the amount.
> > > > >
> > > > > The "core" does not need to understand the specific implementation details
> > > > > of the above three. They can even be implemented in a 3rd-party library.
> > > > > Similar to how we allow users to define their custom MetricReporter.
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
> > > > >
> > > > > > Thanks for the comment, Stephan.
> > > > > >
> > > > > > > - If everything becomes a "core feature", it will make the project hard
> > > > > > > to develop in the future. Thinking "library" / "plugin" / "extension"
> > > > > > > style where possible helps.
> > > > > >
> > > > > >
> > > > > > Completely agree. It is much more important to design a mechanism than
> > > > > > focusing on a specific case. Here is what I am thinking to fully support
> > > > > > custom resource management:
> > > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to define the
> > > > > > resource and the amount required. They will be used to find suitable TM
> > > > > > slots to run the tasks. At this point, the resources are only measured by
> > > > > > amount, i.e. they do not have individual IDs.
> > > > > >
> > > > > > 2. On the TM side, have something like *"ResourceInfoProvider"* to identify
> > > > > > and provide the detailed information of the individual resource, e.g. GPU
> > > > > > ID. It is important because the operator may have to explicitly interact
> > > > > > with the physical resource it uses. The ResourceInfoProvider might look
> > > > > > like something below.
> > > > > > interface ResourceInfoProvider<INFO> {
> > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId,
> > > > > >         ResourceProfile resourceProfile);
> > > > > > }
> > > > > >
> > > > > > - There could be several "*ResourceInfoProvider*" configured on the TM to
> > > > > >   retrieve the information for different resources.
> > > > > > - The TM will be responsible to assign those individual resources to each
> > > > > >   operator according to their requested amount.
> > > > > > - The operators will be able to get the ResourceInfo from their
> > > > > >   RuntimeContext.
> > > > > >
> > > > > > If we agree this is a reasonable final state, we can adapt the current FLIP
> > > > > > to it. In fact it does not sound like a big change to me. All the proposed
> > > > > > configuration can stay as is; it is just that Flink itself won't care about
> > > > > > them, instead a GPUInfoProvider implementing the ResourceInfoProvider will
> > > > > > use them.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]> wrote:
> > > > > >
> > > > > > > Hi all!
> > > > > > >
> > > > > > > The main point I wanted to throw into the discussion is the following:
> > > > > > > - With more and more use cases, more and more tools go into Flink
> > > > > > > - If everything becomes a "core feature", it will make the project hard
> > > > > > > to develop in the future. Thinking "library" / "plugin" / "extension"
> > > > > > > style where possible helps.
> > > > > > >
> > > > > > > - A good thought experiment is always: How many future developers have to
> > > > > > > interact with this code (and possibly understand it partially), even if the
> > > > > > > features they touch have nothing to do with GPU support. If many
> > > > > > > contributors to unrelated features will have to touch it and understand it,
> > > > > > > then let's think if there is a different solution. Maybe there is not, but
> > > > > > > then we should be sure why.
> > > > > > >
> > > > > > > - That led me to raising this issue: If the GPU manager becomes a core
> > > > > > > service in the TaskManager, Environment, RuntimeContext, etc. then everyone
> > > > > > > developing TM and streaming tasks need to understand the GPU manager. That
> > > > > > > seems oddly specific, is my impression.
> > > > > > >
> > > > > > > Access to configuration seems not the right reason to do that. We should
> > > > > > > expose the Flink configuration from the RuntimeContext anyways.
> > > > > > >
> > > > > > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > > > > > although it looks that it would belong to the slot then. Is that what we
> > > > > > > are doing here?
> > > > > > >
> > > > > > > Best,
> > > > > > > Stephan
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <[hidden email]> wrote:
> > > > > > >
> > > > > > > > Thanks for the feedback, Becket.
> > > > > > > >
> > > > > > > > IMO, eventually an operator should only see info of GPUs that
> > are
> > > > > > > dedicated
> > > > > > > > for it, instead of all GPUs on the machine/container in the
> > > current
> > > > > > > design.
> > > > > > > > It does not make sense to let the user who writes a UDF to
> > worry
> > > > > about
> > > > > > > > coordination among multiple operators running on the same
> > > machine.
> > > > > And
> > > > > > if
> > > > > > > > we want to limit the GPU info an operator sees, we should not
> > > let the
> > > > > > > > operator to instantiate GPUManager, which means we have to
> > expose
> > > > > > > something
> > > > > > > > through runtime context, either GPU info or some kind of
> > limited
> > > > > access
> > > > > > > to
> > > > > > > > the GPUManager.
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > >
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <
> > [hidden email]
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > It probably make sense for us to first agree on the final
> > > state.
> > > > > More
> > > > > > > > > specifically, will the resource info be exposed through
> > runtime
> > > > > > context
> > > > > > > > > eventually?
> > > > > > > > >
> > > > > > > > > If that is the final state and we have a seamless migration
> > > story
> > > > > > from
> > > > > > > > this
> > > > > > > > > FLIP to that final state, Personally I think it is OK to
> > > expose the
> > > > > > GPU
> > > > > > > > > info in the runtime context.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > >
> > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > > > > [hidden email]
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > @Yangze,
> > > > > > > > > > I think what Stephan means (@Stephan, please correct me
> if
> > > I'm
> > > > > > wrong)
> > > > > > > > is
> > > > > > > > > > that, we might not need to hold and maintain the
> GPUManager
> > > as a
> > > > > > > > service
> > > > > > > > > in
> > > > > > > > > > TaskManagerServices or RuntimeContext. An alternative is
> to
> > > > > create
> > > > > > /
> > > > > > > > > > retrieve the GPUManager only in the operators that need
> it,
> > > e.g.,
> > > > > > > with
> > > > > > > > a
> > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > >
> > > > > > > > > > @Stephan,
> > > > > > > > > > I agree with you on excluding GPUManager from
> > > > > TaskManagerServices.
> > > > > > > > > >
> > > > > > > > > > - For the first step, where we provide unified
> TM-level
> > > GPU
> > > > > > > > > information
> > > > > > > > > > to all operators, it should be fine to have operators
> > > access /
> > > > > > > > > > lazy-initiate GPUManager by themselves.
> > > > > > > > > > - In future, we might have some more fine-grained GPU
> > > > > > management,
> > > > > > > > > where
> > > > > > > > > > we need to maintain GPUManager as a service and put
> GPU
> > > info
> > > > > in
> > > > > > > slot
> > > > > > > > > > profiles. But at least for now it's not necessary to
> > > introduce
> > > > > > > such
> > > > > > > > > > complexity.
> > > > > > > > > >
> > > > > > > > > > However, I have some concerns on excluding GPUManager
> from
> > > > > > > > RuntimeContext
> > > > > > > > > > and let operators access it directly.
> > > > > > > > > >
> > > > > > > > > > - Configurations needed for creating the GPUManager is
> > not
> > > > > > always
> > > > > > > > > > available for operators.
> > > > > > > > > > - If later we want to have fine-grained control over
> GPU
> > > > > (e.g.,
> > > > > > > > > > operators in each slot can only see GPUs reserved for
> > that
> > > > > > slot),
> > > > > > > > the
> > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > >
> > > > > > > > > > I would suggest to wrap the GPUManager behind
> > RuntimeContext
> > > and
> > > > > > only
> > > > > > > > > > expose the GPUInfo to users. For now, we can declare a
> > method
> > > > > > > > > > `getGPUInfo()` in RuntimeContext, with a default
> definition
> > > that
> > > > > > > calls
> > > > > > > > > > `GPUManager.get()` to get the lazily-created GPUManager.
> If
> > > later
> > > > > > we
> > > > > > > > want
> > > > > > > > > > to create / retrieve GPUManager in a different way, we
> can
> > > simply
> > > > > > > > change
> > > > > > > > > > how `getGPUInfo` is implemented, without needing to
> change
> > > any
> > > > > > public
> > > > > > > > > > interfaces.
> > > > > > > > > >
> > > > > > > > > > Thank you~
> > > > > > > > > >
> > > > > > > > > > Xintong Song
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <
> > > [hidden email]>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > @Stephan
> > > > > > > > > > > Do you mean Minicluster? Yes, it makes sense to share
> the
> > > GPU
> > > > > > > Manager
> > > > > > > > > > > in such scenario.
> > > > > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > > > > GPUManager(ExternalResourceManagers) in TaskExecutor
> > > instead of
> > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > >
> > > > > > > > > > > Regarding the RuntimeContext/FunctionContext, it just
> > > holds the
> > > > > > GPU
> > > > > > > > > > > info instead of the GPU Manager. AFAIK, it's the only
> > > place we
> > > > > > > could
> > > > > > > > > > > pass GPU info to the RichFunction/UserDefinedFunction.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <
> > > > > > > [hidden email]
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000
> > [hidden email]
> > > > > wrote
> > > > > > > > ----
> > > > > > > > > > > >
> > > > > > > > > > > > > > Can we somehow keep this out of the TaskManager
> > > services
> > > > > > > > > > > > > I fear that we could not. IMO, the GPUManager(or
> > > > > > > > > > > > > ExternalServicesManagers in future) is conceptually
> > > one of
> > > > > > the
> > > > > > > > task
> > > > > > > > > > > > > manager services, just like MemoryManager before
> > 1.10.
> > > > > > > > > > > > > - It maintains/holds the GPU resource at TM level
> and
> > > all
> > > > > of
> > > > > > > the
> > > > > > > > > > > > > operators allocate the GPU resources from it. So,
> it
> > > should
> > > > > > be
> > > > > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > > > > - We could add a collection called
> > > ExternalResourceManagers
> > > > > > to
> > > > > > > > hold
> > > > > > > > > > > > > all managers of other external resources in the
> > future.
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Can you help me understand why this needs the addition in
> > > > > > > > > > > > TaskManagerServices or in the RuntimeContext?
> > > > > > > > > > > > Are you worried about the case when multiple Task
> > > Executors
> > > > > run
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > same
> > > > > > > > > > > > JVM? That's not common, but wouldn't it actually be
> > good
> > > in
> > > > > > that
> > > > > > > > case
> > > > > > > > > > to
> > > > > > > > > > > > share the GPU Manager, given that the GPU is shared?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > > Stephan
> > > > > > > > > > > >
> > > > > > > > > > > > ---------------------------
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > In this FLIP, operators need the information. Thus,
> > we
> > > > > expose
> > > > > > > GPU
> > > > > > > > > > > > > information to the RuntimeContext/FunctionContext.
> > The
> > > slot
> > > > > > > > profile
> > > > > > > > > > is
> > > > > > > > > > > > > not aware of GPU resources as GPU is TM level
> > resource
> > > now.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Can the GPU Manager be a "self contained" thing
> > that
> > > > > simply
> > > > > > > > takes
> > > > > > > > > > the
> > > > > > > > > > > > > configuration, and then abstracts everything
> > > internally?
> > > > > > > > > > > > > Yes, we just pass the path/args of the discover
> > script
> > > and
> > > > > > how
> > > > > > > > many
> > > > > > > > > > > > > GPUs per TM to it. It takes the responsibility to
> get
> > > the
> > > > > GPU
> > > > > > > > > > > > > information and expose them to the
> > > > > > > RuntimeContext/FunctionContext
> > > > > > > > > of
> > > > > > > > > > > > > Operators. Meanwhile, we'd better not allow
> operators
> > > to
> > > > > > > directly
> > > > > > > > > > > > > access GPUManager, it should get what they want
> from
> > > > > Context.
> > > > > > > We
> > > > > > > > > > could
> > > > > > > > > > > > > then decouple the interface/implementation of
> > > GPUManager
> > > > > and
> > > > > > > > Public
> > > > > > > > > > > > > API.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <
> > > > > > [hidden email]
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It sounds fine to initially start with GPU
> specific
> > > > > support
> > > > > > > and
> > > > > > > > > > think
> > > > > > > > > > > > > about
> > > > > > > > > > > > > > generalizing this once we better understand the
> > > space.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager
> > > > > services?
> > > > > > > > > > Anything
> > > > > > > > > > > we
> > > > > > > > > > > > > > have to pull through all layers of the TM makes
> the
> > > TM
> > > > > > > > components
> > > > > > > > > > yet
> > > > > > > > > > > > > more
> > > > > > > > > > > > > > complex and harder to maintain.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > > -> do the slot profiles need information about
> the
> > > GPU?
> > > > > > > > > > > > > > -> Can the GPU Manager be a "self contained"
> thing
> > > that
> > > > > > > simply
> > > > > > > > > > takes
> > > > > > > > > > > > > > the configuration, and then abstracts everything
> > > > > > internally?
> > > > > > > > > > > Operators
> > > > > > > > > > > > > can
> > > > > > > > > > > > > > access it via "GPUManager.get()" or so?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <
> > > > > > > [hidden email]>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for all the feedbacks.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right,
> > > I'll add
> > > > > > > them
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > Regarding the general extended resource
> > mechanism,
> > > I
> > > > > > second
> > > > > > > > > > > Xintong's
> > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > - It's better to leverage ResourceProfile and
> > > > > > ResourceSpec
> > > > > > > > > after
> > > > > > > > > > we
> > > > > > > > > > > > > > > supporting fine-grained GPU scheduling. As a
> > first
> > > step
> > > > > > > > > > proposal, I
> > > > > > > > > > > > > > > prefer to not include it in the scope of this
> > FLIP.
> > > > > > > > > > > > > > > - Regarding the "Extended Resource Manager",
> if I
> > > > > > > understand
> > > > > > > > > > > > > > > correctly, it just a code refactoring atm, we
> > could
> > > > > > extract
> > > > > > > > the
> > > > > > > > > > > > > > > open/close/allocateExtendResources of
> GPUManager
> > to
> > > > > that
> > > > > > > > > > > interface. If
> > > > > > > > > > > > > > > that is the case, +1 to do it during
> > > implementation.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > As Xintong said, we looked into how Spark
> > supports
> > > a
> > > > > > > general
> > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > Resource Scheduling" before and decided to
> > > introduce a
> > > > > > > common
> > > > > > > > > > > resource
> > > > > > > > > > > > > > > configuration
> > > > > > > > > > > > > > >
> > > > > > > > > >
> > > > > schema(taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > to make it more extensible. I think the
> > "resource"
> > > is a
> > > > > > > > proper
> > > > > > > > > > > level
> > > > > > > > > > > > > > > to contain all the configs of extended
> resources.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <
> > > > > > > > > [hidden email]
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > There is no doubt that GPU resource
> management
> > > > > support
> > > > > > > will
> > > > > > > > > > > greatly
> > > > > > > > > > > > > > > > facilitate the development of AI-related
> > > applications
> > > > > > by
> > > > > > > > > > PyFlink
> > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Regarding the names of several GPU
> > > configurations, I
> > > > > > > think
> > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > > > > better
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > delete the resource field makes it consistent
> > > with
> > > > > the
> > > > > > > > names
> > > > > > > > > of
> > > > > > > > > > > other
> > > > > > > > > > > > > > > > resource-related configurations in
> > > TaskManagerOption.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > e.g.
> > > taskmanager.resource.gpu.discovery-script.path
> > > > > ->
> > > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline discussion about
> > > > > > > > > > > > > > > > > > making the "GPU Support" a general "Extended Resource Support". We
> > > > > > > > > > > > > > > > > > believe supporting extended resources through a general mechanism
> > > > > > > > > > > > > > > > > > is definitely a good and extensible way. The reason we propose
> > > > > > > > > > > > > > > > > > this FLIP with its scope narrowed down to GPU alone is mainly the
> > > > > > > > > > > > > > > > > > concern about the extra effort and review capacity needed for a
> > > > > > > > > > > > > > > > > > general mechanism.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > To come up with a good design for a general extended resource
> > > > > > > > > > > > > > > > > > management mechanism, we would need to investigate more how people
> > > > > > > > > > > > > > > > > > use different kinds of resources in practice. For GPU, we learnt
> > > > > > > > > > > > > > > > > > such knowledge from the experts, Becket and his team members. But
> > > > > > > > > > > > > > > > > > for FPGA, or other potential extended resources, we don't have
> > > > > > > > > > > > > > > > > > such convenient information sources, so the investigation would
> > > > > > > > > > > > > > > > > > require more effort, which I tend to think is not necessary atm.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark supports a
> > > > > > > > > > > > > > > > > > general "Custom Resource Scheduling". Assuming we want to have a
> > > > > > > > > > > > > > > > > > similar general extended resource mechanism in the future, we
> > > > > > > > > > > > > > > > > > believe that the current GPU support design can be easily
> > > > > > > > > > > > > > > > > > extended in an incremental way, without too much rework.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > - The most important part is probably the user interfaces. Spark
> > > > > > > > > > > > > > > > > > offers configuration options to define the amount, discovery
> > > > > > > > > > > > > > > > > > script and vendor (on k8s) on a per resource type basis [1],
> > > > > > > > > > > > > > > > > > which is very similar to what we proposed in this FLIP. I think
> > > > > > > > > > > > > > > > > > it's not necessary to expose config options in the general way
> > > > > > > > > > > > > > > > > > atm, since we do not support other resource types now. If we
> > > > > > > > > > > > > > > > > > later decide to have per resource type config options, we can
> > > > > > > > > > > > > > > > > > keep backwards compatibility for the currently proposed options
> > > > > > > > > > > > > > > > > > with a simple key mapping.
> > > > > > > > > > > > > > > > > > - For the GPU Manager, if needed later we can change it to an
> > > > > > > > > > > > > > > > > > "Extended Resource Manager" (or whatever it is called). That
> > > > > > > > > > > > > > > > > > should be a pure component-internal refactoring.
> > > > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are already fields
> > > > > > > > > > > > > > > > > > for general extended resources. We can of course leverage them
> > > > > > > > > > > > > > > > > > when supporting fine-grained GPU scheduling. That is also not in
> > > > > > > > > > > > > > > > > > the scope of this first-step proposal, and would require FLIP-56
> > > > > > > > > > > > > > > > > > to be finished first.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a separate
> > > > > > > > > > > > > > > > > > FLIP for the general extended resource mechanism, and keep it in
> > > > > > > > > > > > > > > > > > mind when discussing and implementing the current one.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to generalize
> > > > > > > > > > > > > > > > > > > the resource management to support custom resources. Having that
> > > > > > > > > > > > > > > > > > > allows users to add new resources by themselves. The general
> > > > > > > > > > > > > > > > > > > resource management may involve two different aspects:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported by the
> > > > > > > > > > > > > > > > > > > extended resources in ResourceProfile and ResourceSpec. This will
> > > > > > > > > > > > > > > > > > > likely cover the majority of the cases.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to assign the
> > > > > > > > > > > > > > > > > > > resources to different tasks, operators, and so on. This may
> > > > > > > > > > > > > > > > > > > require two levels / steps:
> > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into suitable
> > > > > > > > > > > > > > > > > > > slots. It is done by the global RM and is not customizable right
> > > > > > > > > > > > > > > > > > > now.
> > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the operators in
> > > > > > > > > > > > > > > > > > > the TM, e.g. GPU 1 for operator A, GPU 2 for operator B. This
> > > > > > > > > > > > > > > > > > > step is needed assuming the global RM does not distinguish
> > > > > > > > > > > > > > > > > > > individual resources of the same type. Not distinguishing them
> > > > > > > > > > > > > > > > > > > is fine for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should discover
> > > > > > > > > > > > > > > > > > > the physical GPU information and bind/match it to each operator.
> > > > > > > > > > > > > > > > > > > Making this general will fill in the missing piece to support
> > > > > > > > > > > > > > > > > > > custom resource type definition. But I'd avoid calling it an
> > > > > > > > > > > > > > > > > > > "External Resource Manager" to avoid confusion with the RM; maybe
> > > > > > > > > > > > > > > > > > > something like "Operator Resource Assigner" would be more
> > > > > > > > > > > > > > > > > > > accurate. So for each resource type users can have an optional
> > > > > > > > > > > > > > > > > > > "Operator Resource Assigner" in the TM. For memory, users don't
> > > > > > > > > > > > > > > > > > > need this, but for other extended resources, users may need
> > > > > > > > > > > > > > > > > > > that.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource Assigner" is
> > > > > > > > > > > > > > > > > > > achievable in this FLIP. But I am also OK with having that in a
> > > > > > > > > > > > > > > > > > > separate FLIP because the interface between the "Operator
> > > > > > > > > > > > > > > > > > > Resource Assigner" and the operator may take a while to settle
> > > > > > > > > > > > > > > > > > > down if we want to make it generic. But I think our
> > > > > > > > > > > > > > > > > > > implementation should take this future work into consideration
> > > > > > > > > > > > > > > > > > > so that we don't need to break backwards compatibility once we
> > > > > > > > > > > > > > > > > > > have that.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of GPU-aware
> > > > > > > > > > > > > > > > > > > > scheduling and GPU allocation, as I have no experience with
> > > > > > > > > > > > > > > > > > > > that.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is whether it makes
> > > > > > > > > > > > > > > > > > > > sense to look at the "GPU Manager" as an "External Resource
> > > > > > > > > > > > > > > > > > > > Manager", with GPU being one such resource.
> > > > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and ResourceSpec, that
> > > > > > > > > > > > > > > > > > > > is how it is done there.
> > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible. Maybe there
> > > > > > > > > > > > > > > > > > > > is a GPU Resource, a specialized NVIDIA GPU Resource, an FPGA
> > > > > > > > > > > > > > > > > > > > Resource, an Alibaba TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP, Yangze. GPU resource management support is
> > > > > > > > > > > > > > > > > > > > > a must-have for machine learning use cases. Actually it is one
> > > > > > > > > > > > > > > > > > > > > of the most frequently asked questions from the users who are
> > > > > > > > > > > > > > > > > > > > > interested in using Flink for ML.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Some quick comments / questions on the wiki.
> > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be mentioned in
> > > > > > > > > > > > > > > > > > > > > the public interface section.
> > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a public API?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the discussion,
> > > > > > > > > > > > > > > > > > > > > > Yangze.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in Flink
> > > > > > > > > > > > > > > > > > > > > > is significant, especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I
> > > > > > > > > > > > > > > > > > > > > > think it's a very good first step for Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add
> > > > > > > > > > > > > > > > > > > > > > > GPU support in Flink" [1].
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs a task executor
> > > > > > > > > > > > > > > > > > > > > > > has and forward such requirements to the external resource
> > > > > > > > > > > > > > > > > > > > > > > managers (for Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > - Provide information about the available GPU resources to
> > > > > > > > > > > > > > > > > > > > > > > operators.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services
> > > > > > > > > > > > > > > > > > > > > > > to discover and expose GPU resource information to the
> > > > > > > > > > > > > > > > > > > > > > > context of functions.
> > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which
> > > > > > > > > > > > > > > > > > > > > > > we provide the privilege mode to help users achieve
> > > > > > > > > > > > > > > > > > > > > > > worker-level isolation in standalone mode.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki document [1].
> > > > > > > > > > > > > > > > > > > > > > > Looking forward to your feedback.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > >

Yangze Guo

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Thanks for the feedback, @Till and @Xintong.

Regarding separating the interface, I'm also +1 for it.
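
For reference, a minimal sketch of how the separated interfaces could be dispatched with `instanceof`. The interface names `KubernetesSupport` and `YarnSupport`, the `resourceName()` method, and the `example.com/fpga` key are assumptions made up for illustration, not what the FLIP defines:

```java
import java.util.ArrayList;
import java.util.List;

final class DriverDispatchSketch {

    /** Base interface with no deployment-specific methods (hypothetical shape). */
    public interface ExternalResourceDriver {
        String resourceName();
    }

    /** Implemented only by drivers that support Kubernetes (hypothetical name). */
    public interface KubernetesSupport {
        String kubernetesResourceKey();
    }

    /** Implemented only by drivers that support Yarn (hypothetical name). */
    public interface YarnSupport {
        String yarnResourceKey();
    }

    /** A GPU driver that opts in to both deployment targets. */
    public static final class GpuDriver
            implements ExternalResourceDriver, KubernetesSupport, YarnSupport {
        @Override public String resourceName() { return "gpu"; }
        @Override public String kubernetesResourceKey() { return "nvidia.com/gpu"; }
        @Override public String yarnResourceKey() { return "yarn.io/gpu"; }
    }

    /** An FPGA driver that supports Kubernetes only; its key is a placeholder. */
    public static final class FpgaDriver
            implements ExternalResourceDriver, KubernetesSupport {
        @Override public String resourceName() { return "fpga"; }
        @Override public String kubernetesResourceKey() { return "example.com/fpga"; }
    }

    /** Kubernetes side: collect keys only from drivers that opted in. */
    static List<String> kubernetesKeys(List<ExternalResourceDriver> drivers) {
        List<String> keys = new ArrayList<>();
        for (ExternalResourceDriver driver : drivers) {
            if (driver instanceof KubernetesSupport) {
                keys.add(((KubernetesSupport) driver).kubernetesResourceKey());
            }
        }
        return keys;
    }

    /** Yarn side: drivers without YarnSupport are skipped. */
    static List<String> yarnKeys(List<ExternalResourceDriver> drivers) {
        List<String> keys = new ArrayList<>();
        for (ExternalResourceDriver driver : drivers) {
            if (driver instanceof YarnSupport) {
                keys.add(((YarnSupport) driver).yarnResourceKey());
            }
        }
        return keys;
    }
}
```

A driver that does not implement `YarnSupport` is simply skipped on Yarn, which is how an unsupported deployment target would be expressed without touching the base interface.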

Regarding the resource allocation interface: true, it's dangerous to
give user code that much access. Changing the return type to Map<String
key, String/Long value> makes sense to me. AFAIK, it is compatible
with all the first-party supported resources for Yarn/Kubernetes. It
could also free us from the potential dependency issue.
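
To illustrate the Map-based return type, here is a rough sketch in which the driver only returns key/value pairs and the resource manager alone assembles them into the request. `RequestStub` is a stand-in for `ContainerRequest` / `Pod`, and `ExternalResourceSpecProvider`, `getExternalResource()` and `assemble()` are hypothetical names; summing the amounts for duplicate keys is just one possible policy:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ExternalResourceRequestSketch {

    /** Stand-in for a deployment request object (NOT a Flink, K8s or Yarn class). */
    public static final class RequestStub {
        private final Map<String, Long> resources = new LinkedHashMap<>();

        public Map<String, Long> resources() {
            return resources;
        }
    }

    /** Driver side: describes what it needs, never sees the request object. */
    public interface ExternalResourceSpecProvider {
        Map<String, String> getExternalResource();
    }

    /** Resource manager side: the only code that mutates the request. */
    static RequestStub assemble(Iterable<ExternalResourceSpecProvider> providers) {
        RequestStub request = new RequestStub();
        for (ExternalResourceSpecProvider provider : providers) {
            for (Map.Entry<String, String> entry : provider.getExternalResource().entrySet()) {
                // Values arrive as strings; parse and validate them centrally.
                long amount = Long.parseLong(entry.getValue());
                if (amount <= 0) {
                    throw new IllegalArgumentException(
                            "Non-positive amount for " + entry.getKey() + ": " + amount);
                }
                // Assumed policy: amounts for the same key are added up.
                request.resources().merge(entry.getKey(), amount, Long::sum);
            }
        }
        return request;
    }
}
```

Because the driver only deals with `java.util.Map`, it compiles without Hadoop or a Kubernetes client on the classpath, which is the dependency issue mentioned above.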

Best,
Yangze Guo

On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]> wrote:

>
> Thanks for updating the FLIP, Yangze.
>
> I agree with Till that we probably want to separate the K8s/Yarn decorator
> calls. Users can still configure one driver class, and we can use
> `instanceof` to check whether the driver implemented K8s/Yarn specific
> interfaces.
>
> Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
> (`AbstractKubernetesStepDecorator` directly manipulates the `Pod`) to user
> code. It gives user code more access than is needed for defining an external
> resource, which might cause problems. Instead, I would suggest having an
> interface like `Map<String key, String value>
> getYarn/KubernetesExternalResource()` and assembling the entries into the
> `ContainerRequest` / `Pod` in the Yarn/KubernetesResourceManager.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
>
> > Hi everyone,
> >
> > I'm a bit late to the party. I think the current proposal looks good.
> >
> > Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> > would suggest not including the decorator calls for Kubernetes and Yarn in
> > the base interface. Instead, I would suggest segregating the
> > deployment-specific decorator calls into separate interfaces. That way an
> > ExternalResourceDriver does not have to support all deployments from the
> > very beginning. Moreover, some resources might not be supported by a
> > specific deployment target, and the natural way to express this would be to
> > not implement the respective deployment-specific interface.
> >
> > Moreover, having void
> > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > in the ExternalResourceDriver interface would require Hadoop on Flink's
> > classpath whenever the external resource driver is being used.
> >
> > [1]
> >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> >
> > Cheers,
> > Till
> >
> > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:
> >
> > > Nice, thanks a lot!
> > >
> > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
> > >
> > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > >
> > > > I've updated the FLIP accordingly. I did not add a
> > > > ResourceInfoProvider. Instead, I introduced the ExternalResourceDriver,
> > > > which takes responsibility for all relevant operations on both the RM
> > > > and TM sides.
> > > > After rethinking the decoupling of external resource management
> > > > from the TaskExecutor, I think we could do the same thing on the
> > > > ResourceManager side. We do not need to add specific allocation
> > > > logic to the ResourceManager each time we add a specific external
> > > > resource.
> > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > containerRequest.
> > > > - For Kubernetes, the ExternalResourceDriver could provide a decorator
> > > > for the TM pod.
> > > >
> > > > In this way, just like MetricReporter, we allow users to define their
> > > > custom ExternalResourceDriver. It is more extensible and fits the
> > > > separation of concerns. For more details, please take a look at [1].
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > > > >
> > > > > This sounds good to go ahead from my side.
> > > > >
> > > > > I like the approach that Becket suggested - in that case the core
> > > > > abstraction that everyone would need to understand would be "external
> > > > > resource allocation" and the "ResourceInfoProvider", and the GPU specific
> > > > > code would be a specific implementation only known to that component that
> > > > > allocates the external resource. That fits the separation of concerns well.
> > > > >
> > > > > I also understand that it should not be over-engineered in the first
> > > > > version, so some simplification makes sense, and then gradually expand from
> > > > > there.
> > > > >
> > > > > So +1 to go ahead with what was suggested above (Xintong / Becket) from my
> > > > > side.
> > > > >
> > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > > > >
> > > > > > Thanks for the comments, Stephan & Becket.
> > > > > >
> > > > > > @Stephan
> > > > > >
> > > > > > I see your concern, and I completely agree with you that we should first
> > > > > > think about the "library" / "plugin" / "extension" style if possible.
> > > > > >
> > > > > > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > > > > > although it looks that it would belong to the slot then. Is that what we
> > > > > > > are doing here?
> > > > > >
> > > > > >
> > > > > > In the current proposal, we do not have the GPUs sliced and assigned to
> > > > > > slots, because it could be problematic without dynamic slot allocation.
> > > > > > E.g., the number of GPUs might not be evenly divisible by the number of
> > > > > > slots.
> > > > > >
> > > > > > I think it makes sense to eventually have the GPUs assigned to slots. Even
> > > > > > then, we might still need a TM level GPUManager (or ResourceProvider like
> > > > > > Becket suggested). For memory, in each slot we can simply request the
> > > > > > amount of memory, leaving it to JVM / OS to decide which memory (address)
> > > > > > should be assigned. For GPU, and potentially other resources like FPGA, we
> > > > > > need to explicitly specify which GPU (index) should be used. Therefore, we
> > > > > > need some component at the TM level to coordinate which slot uses which
> > > > > > GPU.
> > > > > >
> > > > > > IMO, unless we say Flink will not support slot-level GPU slicing at least
> > > > > > in the foreseeable future, I don't see a good way to avoid touching the TM
> > > > > > core. To that end, I think Becket's suggestion points in a good direction:
> > > > > > it supports more features (GPU, FPGA, etc.) with less coupling to the TM
> > > > > > core (which only needs to understand the general interfaces). The detailed
> > > > > > implementation for specific resource types can even be encapsulated as a
> > > > > > library.
> > > > > >
> > > > > > @Becket
> > > > > >
> > > > > > Thanks for sharing your thoughts on the final state. Regardless of the
> > > > > > details of how the interfaces should look, I think this is a really good
> > > > > > abstraction for supporting general resource types.
> > > > > >
> > > > > > I'd like to further clarify that the following three things are all that
> > > > > > the "Flink core" needs to understand.
> > > > > >
> > > > > > - The *amount* of resource, for scheduling. Actually, we already have
> > > > > > the Resource class in ResourceProfile and ResourceSpec for extended
> > > > > > resources. It's just not really used.
> > > > > > - The *info* that Flink provides to the operators / user code.
> > > > > > - The *provider*, which generates the info based on the amount.
> > > > > >
> > > > > > The "core" does not need to understand the specific implementation details
> > > > > > of the above three. They can even be implemented in a 3rd-party library,
> > > > > > similar to how we allow users to define their custom MetricReporter.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]>
> > > > wrote:
> > > > > >
> > > > > > > Thanks for the comment, Stephan.
> > > > > > >
> > > > > > > - If everything becomes a "core feature", it will make the
> > > project
> > > > hard
> > > > > > > > to develop in the future. Thinking "library" / "plugin" /
> > > > "extension"
> > > > > > > style
> > > > > > > > where possible helps.
> > > > > > >
> > > > > > >
> > > > > > > Completely agree. It is much more important to design a mechanism
> > > > than
> > > > > > > focusing on a specific case. Here is what I am thinking to fully
> > > > support
> > > > > > > custom resource management:
> > > > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to
> > > > define
> > > > > > the
> > > > > > > resource and the amount required. They will be used to find
> > > suitable
> > > > TMs
> > > > > > > slots to run the tasks. At this point, the resources are only
> > > > measured by
> > > > > > > amount, i.e. they do not have individual ID.
> > > > > > >
> > > > > > > 2. On the TM side, have something like *"ResourceInfoProvider"*
> > to
> > > > > > identify
> > > > > > > and provides the detail information of the individual resource,
> > > e.g.
> > > > GPU
> > > > > > > ID.. It is important because the operator may have to explicitly
> > > > interact
> > > > > > > with the physical resource it uses. The ResourceInfoProvider
> > might
> > > > look
> > > > > > > like something below.
> > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId,
> > > > > > > ResourceProfile resourceProfile);
> > > > > > > }
> > > > > > >
> > > > > > > - There could be several "*ResourceInfoProvider*" configured on
> > the
> > > > TM to
> > > > > > > retrieve the information for different resources.
> > > > > > > - The TM will be responsible to assign those individual resources
> > > to
> > > > each
> > > > > > > operator according to their requested amount.
> > > > > > > - The operators will be able to get the ResourceInfo from their
> > > > > > > RuntimeContext.
> > > > > > >
> > > > > > > If we agree this is a reasonable final state. We can adapt the
> > > > current
> > > > > > FLIP
> > > > > > > > to it. In fact it does not sound like a big change to me. All the
> > > proposed
> > > > > > > configuration can be as is, it is just that Flink itself won't
> > care
> > > > about
> > > > > > > > them, instead a GPUInfoProvider implementing the
> > > ResourceInfoProvider
> > > > > > will
> > > > > > > use them.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]>
> > > > wrote:
> > > > > > >
> > > > > > > > Hi all!
> > > > > > > >
> > > > > > > > The main point I wanted to throw into the discussion is the
> > > > following:
> > > > > > > > - With more and more use cases, more and more tools go into
> > > Flink
> > > > > > > > - If everything becomes a "core feature", it will make the
> > > > project
> > > > > > hard
> > > > > > > > to develop in the future. Thinking "library" / "plugin" /
> > > > "extension"
> > > > > > > style
> > > > > > > > where possible helps.
> > > > > > > >
> > > > > > > > - A good thought experiment is always: How many future
> > > developers
> > > > > > have
> > > > > > > to
> > > > > > > > interact with this code (and possibly understand it partially),
> > > > even if
> > > > > > > the
> > > > > > > > features they touch have nothing to do with GPU support. If
> > many
> > > > > > > > contributors to unrelated features will have to touch it and
> > > > understand
> > > > > > > it,
> > > > > > > > then let's think if there is a different solution. Maybe there
> > is
> > > > not,
> > > > > > > but
> > > > > > > > then we should be sure why.
> > > > > > > >
> > > > > > > > - That led me to raising this issue: If the GPU manager
> > > becomes a
> > > > > > core
> > > > > > > > service in the TaskManager, Environment, RuntimeContext, etc.
> > > then
> > > > > > > everyone
> > > > > > > > > developing TM and streaming tasks needs to understand the GPU
> > > > manager.
> > > > > > > That
> > > > > > > > seems oddly specific, is my impression.
> > > > > > > >
> > > > > > > > Access to configuration seems not the right reason to do that.
> > We
> > > > > > should
> > > > > > > > expose the Flink configuration from the RuntimeContext anyways.
> > > > > > > >
> > > > > > > > If GPUs are sliced and assigned during scheduling, there may be
> > > > reason,
> > > > > > > > although it looks that it would belong to the slot then. Is
> > that
> > > > what
> > > > > > we
> > > > > > > > are doing here?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Stephan
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <
> > > > [hidden email]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > >
> > > > > > > > > IMO, eventually an operator should only see info of GPUs that
> > > are
> > > > > > > > dedicated
> > > > > > > > > for it, instead of all GPUs on the machine/container in the
> > > > current
> > > > > > > > design.
> > > > > > > > > > It does not make sense to let the user who writes a UDF
> > > worry
> > > > > > about
> > > > > > > > > coordination among multiple operators running on the same
> > > > machine.
> > > > > > And
> > > > > > > if
> > > > > > > > > we want to limit the GPU info an operator sees, we should not
> > > > let the
> > > > > > > > > > operator instantiate GPUManager, which means we have to
> > > expose
> > > > > > > > something
> > > > > > > > > through runtime context, either GPU info or some kind of
> > > limited
> > > > > > access
> > > > > > > > to
> > > > > > > > > the GPUManager.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <
> > > [hidden email]
> > > > >
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > > It probably makes sense for us to first agree on the final
> > > > state.
> > > > > > More
> > > > > > > > > > specifically, will the resource info be exposed through
> > > runtime
> > > > > > > context
> > > > > > > > > > eventually?
> > > > > > > > > >
> > > > > > > > > > If that is the final state and we have a seamless migration
> > > > story
> > > > > > > from
> > > > > > > > > this
> > > > > > > > > > FLIP to that final state, Personally I think it is OK to
> > > > expose the
> > > > > > > GPU
> > > > > > > > > > info in the runtime context.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > > > > > [hidden email]
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > @Yangze,
> > > > > > > > > > > I think what Stephan means (@Stephan, please correct me
> > if
> > > > I'm
> > > > > > > wrong)
> > > > > > > > > is
> > > > > > > > > > > that, we might not need to hold and maintain the
> > GPUManager
> > > > as a
> > > > > > > > > service
> > > > > > > > > > in
> > > > > > > > > > > TaskManagerServices or RuntimeContext. An alternative is
> > to
> > > > > > create
> > > > > > > /
> > > > > > > > > > > retrieve the GPUManager only in the operators that need
> > it,
> > > > e.g.,
> > > > > > > > with
> > > > > > > > > a
> > > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > > >
> > > > > > > > > > > @Stephan,
> > > > > > > > > > > I agree with you on excluding GPUManager from
> > > > > > TaskManagerServices.
> > > > > > > > > > >
> > > > > > > > > > > - For the first step, where we provide unified
> > TM-level
> > > > GPU
> > > > > > > > > > information
> > > > > > > > > > > to all operators, it should be fine to have operators
> > > > access /
> > > > > > > > > > > lazy-initiate GPUManager by themselves.
> > > > > > > > > > > - In future, we might have some more fine-grained GPU
> > > > > > > management,
> > > > > > > > > > where
> > > > > > > > > > > we need to maintain GPUManager as a service and put
> > GPU
> > > > info
> > > > > > in
> > > > > > > > slot
> > > > > > > > > > > profiles. But at least for now it's not necessary to
> > > > introduce
> > > > > > > > such
> > > > > > > > > > > complexity.
> > > > > > > > > > >
> > > > > > > > > > > However, I have some concerns on excluding GPUManager
> > from
> > > > > > > > > RuntimeContext
> > > > > > > > > > > > and letting operators access it directly.
> > > > > > > > > > >
> > > > > > > > > > > > - Configurations needed for creating the GPUManager are
> > > not
> > > > > > > always
> > > > > > > > > > > available for operators.
> > > > > > > > > > > - If later we want to have fine-grained control over
> > GPU
> > > > > > (e.g.,
> > > > > > > > > > > operators in each slot can only see GPUs reserved for
> > > that
> > > > > > > slot),
> > > > > > > > > the
> > > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > > >
> > > > > > > > > > > I would suggest to wrap the GPUManager behind
> > > RuntimeContext
> > > > and
> > > > > > > only
> > > > > > > > > > > expose the GPUInfo to users. For now, we can declare a
> > > method
> > > > > > > > > > > `getGPUInfo()` in RuntimeContext, with a default
> > definition
> > > > that
> > > > > > > > calls
> > > > > > > > > > > `GPUManager.get()` to get the lazily-created GPUManager.
> > If
> > > > later
> > > > > > > we
> > > > > > > > > want
> > > > > > > > > > > to create / retrieve GPUManager in a different way, we
> > can
> > > > simply
> > > > > > > > > change
> > > > > > > > > > > how `getGPUInfo` is implemented, without needing to
> > change
> > > > any
> > > > > > > public
> > > > > > > > > > > interfaces.
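
A minimal sketch of the suggestion above, not taken from the thread. `GPUManager.get()` and `getGPUInfo()` are the names used in the discussion. The `RuntimeContext` here is a reduced stand-in for Flink's interface, and the `List<Integer>` return type, the `Holder` class and the hard-coded indexes are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical manager, created lazily on first access via the
// initialization-on-demand holder idiom.
final class GPUManager {
    private final List<Integer> gpuIndexes;

    private GPUManager(List<Integer> gpuIndexes) {
        this.gpuIndexes = Collections.unmodifiableList(gpuIndexes);
    }

    private static final class Holder {
        // Placeholder for real discovery (e.g. running a discovery script);
        // the indexes are hard-coded here purely for illustration.
        static final GPUManager INSTANCE = new GPUManager(Arrays.asList(0, 1));
    }

    static GPUManager get() {
        return Holder.INSTANCE;
    }

    List<Integer> getGPUInfo() {
        return gpuIndexes;
    }
}

// Stand-in for Flink's RuntimeContext, reduced to the one method discussed.
interface RuntimeContext {
    // Default definition delegates to the lazily created manager. A later
    // implementation can override this without changing the public interface.
    default List<Integer> getGPUInfo() {
        return GPUManager.get().getGPUInfo();
    }
}
```

User code would call only `getGPUInfo()` on the context. How the manager is created or scoped (per JVM, per TaskExecutor, per slot) stays an implementation detail behind that method.
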
> > > > > > > > > > >
> > > > > > > > > > > Thank you~
> > > > > > > > > > >
> > > > > > > > > > > Xintong Song
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <
> > > > [hidden email]>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > Do you mean Minicluster? Yes, it makes sense to share
> > the
> > > > GPU
> > > > > > > > Manager
> > > > > > > > > > > > in such scenario.
> > > > > > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > > > > > GPUManager(ExternalResourceManagers) in TaskExecutor
> > > > instead of
> > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > >
> > > > > > > > > > > > Regarding the RuntimeContext/FunctionContext, it just
> > > > holds the
> > > > > > > GPU
> > > > > > > > > > > > info instead of the GPU Manager. AFAIK, it's the only
> > > > place we
> > > > > > > > could
> > > > > > > > > > > > pass GPU info to the RichFunction/UserDefinedFunction.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <
> > > > > > > > [hidden email]
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000
> > > [hidden email]
> > > > > > wrote
> > > > > > > > > ----
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > Can we somehow keep this out of the TaskManager
> > > > services
> > > > > > > > > > > > > > I fear that we could not. IMO, the GPUManager(or
> > > > > > > > > > > > > > ExternalServicesManagers in future) is conceptually
> > > > one of
> > > > > > > the
> > > > > > > > > task
> > > > > > > > > > > > > > manager services, just like MemoryManager before
> > > 1.10.
> > > > > > > > > > > > > > - It maintains/holds the GPU resource at TM level
> > and
> > > > all
> > > > > > of
> > > > > > > > the
> > > > > > > > > > > > > > operators allocate the GPU resources from it. So,
> > it
> > > > should
> > > > > > > be
> > > > > > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > > > > > - We could add a collection called
> > > > ExternalResourceManagers
> > > > > > > to
> > > > > > > > > hold
> > > > > > > > > > > > > > all managers of other external resources in the
> > > future.
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Can you help me understand why this needs the
> > addition
> > > in
> > > > > > > > > > > > > TaskManagerServices
> > > > > > > > > > > > > or in the RuntimeContext?
> > > > > > > > > > > > > Are you worried about the case when multiple Task
> > > > Executors
> > > > > > run
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > same
> > > > > > > > > > > > > JVM? That's not common, but wouldn't it actually be
> > > good
> > > > in
> > > > > > > that
> > > > > > > > > case
> > > > > > > > > > > to
> > > > > > > > > > > > > share the GPU Manager, given that the GPU is shared?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Stephan
> > > > > > > > > > > > >
> > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > > In this FLIP, operators need the information. Thus,
> > > we
> > > > > > expose
> > > > > > > > GPU
> > > > > > > > > > > > > > information to the RuntimeContext/FunctionContext.
> > > The
> > > > slot
> > > > > > > > > profile
> > > > > > > > > > > is
> > > > > > > > > > > > > > not aware of GPU resources as GPU is TM level
> > > resource
> > > > now.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Can the GPU Manager be a "self contained" thing
> > > that
> > > > > > simply
> > > > > > > > > takes
> > > > > > > > > > > the
> > > > > > > > > > > > > > configuration, and then abstracts everything
> > > > internally?
> > > > > > > > > > > > > > Yes, we just pass the path/args of the discover
> > > script
> > > > and
> > > > > > > how
> > > > > > > > > many
> > > > > > > > > > > > > > GPUs per TM to it. It takes the responsibility to
> > get
> > > > the
> > > > > > GPU
> > > > > > > > > > > > > > information and expose them to the
> > > > > > > > RuntimeContext/FunctionContext
> > > > > > > > > > of
> > > > > > > > > > > > > > Operators. Meanwhile, we'd better not allow
> > operators
> > > > to
> > > > > > > > directly
> > > > > > > > > > > > > > > access GPUManager; they should get what they want
> > from
> > > > > > Context.
> > > > > > > > We
> > > > > > > > > > > could
> > > > > > > > > > > > > > then decouple the interface/implementation of
> > > > GPUManager
> > > > > > and
> > > > > > > > > Public
> > > > > > > > > > > > > > API.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <
> > > > > > > [hidden email]
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It sounds fine to initially start with GPU
> > specific
> > > > > > support
> > > > > > > > and
> > > > > > > > > > > think
> > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > generalizing this once we better understand the
> > > > space.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager
> > > > > > services?
> > > > > > > > > > > Anything
> > > > > > > > > > > > we
> > > > > > > > > > > > > > > have to pull through all layers of the TM makes
> > the
> > > > TM
> > > > > > > > > components
> > > > > > > > > > > yet
> > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > complex and harder to maintain.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > > > -> do the slot profiles need information about
> > the
> > > > GPU?
> > > > > > > > > > > > > > > -> Can the GPU Manager be a "self contained"
> > thing
> > > > that
> > > > > > > > simply
> > > > > > > > > > > takes
> > > > > > > > > > > > > > > the configuration, and then abstracts everything
> > > > > > > internally?
> > > > > > > > > > > > Operators
> > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > access it via "GPUManager.get()" or so?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <
> > > > > > > > [hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right,
> > > > I'll add
> > > > > > > > them
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > Regarding the general extended resource
> > > mechanism,
> > > > I
> > > > > > > second
> > > > > > > > > > > > Xintong's
> > > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > > - It's better to leverage ResourceProfile and
> > > > > > > ResourceSpec
> > > > > > > > > > after
> > > > > > > > > > > we
> > > > > > > > > > > > > > > > > support fine-grained GPU scheduling. As a
> > > first
> > > > step
> > > > > > > > > > > proposal, I
> > > > > > > > > > > > > > > > prefer to not include it in the scope of this
> > > FLIP.
> > > > > > > > > > > > > > > > - Regarding the "Extended Resource Manager",
> > if I
> > > > > > > > understand
> > > > > > > > > > > > > > > > > correctly, it is just a code refactoring atm; we
> > > could
> > > > > > > extract
> > > > > > > > > the
> > > > > > > > > > > > > > > > open/close/allocateExtendResources of
> > GPUManager
> > > to
> > > > > > that
> > > > > > > > > > > > interface. If
> > > > > > > > > > > > > > > > that is the case, +1 to do it during
> > > > implementation.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > As Xintong said, we looked into how Spark
> > > supports
> > > > a
> > > > > > > > general
> > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > > Resource Scheduling" before and decided to
> > > > introduce a
> > > > > > > > common
> > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > configuration
> > > > > > > > > > > > > > > > > schema (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > > to make it more extensible. I think the
> > > "resource"
> > > > is a
> > > > > > > > > proper
> > > > > > > > > > > > level
> > > > > > > > > > > > > > > > to contain all the configs of extended
> > resources.
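
A minimal sketch of how keys following the schema above could be built and read, not taken from the thread. The class and method names are hypothetical, a plain `Map<String, String>` stands in for Flink's `Configuration`, and the exact key suffixes follow the schema only as written in this message.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for the per-resource key schema
// taskmanager.resource.{resourceName}.amount / discovery-script.
final class ExtendedResourceOptions {
    private static final String PREFIX = "taskmanager.resource.";

    static String amountKey(String resourceName) {
        return PREFIX + resourceName + ".amount";
    }

    static String discoveryScriptKey(String resourceName) {
        return PREFIX + resourceName + ".discovery-script";
    }

    // Returns the configured amount, or 0 when the resource is not configured.
    static long amount(Map<String, String> config, String resourceName) {
        return Long.parseLong(config.getOrDefault(amountKey(resourceName), "0"));
    }

    // Returns the discovery script setting, if one is present.
    static Optional<String> discoveryScript(
            Map<String, String> config, String resourceName) {
        return Optional.ofNullable(config.get(discoveryScriptKey(resourceName)));
    }
}
```

Because the resource name is only a segment of the key, a second resource type such as FPGA would need no new option definitions, only a new `resourceName`.
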
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > There is no doubt that GPU resource
> > management
> > > > > > support
> > > > > > > > will
> > > > > > > > > > > > greatly
> > > > > > > > > > > > > > > > > facilitate the development of AI-related
> > > > applications
> > > > > > > by
> > > > > > > > > > > PyFlink
> > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Regarding the names of several GPU
> > > > configurations, I
> > > > > > > > think
> > > > > > > > > it
> > > > > > > > > > > is
> > > > > > > > > > > > > > better
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > delete the resource field to make it consistent
> > > > with
> > > > > > the
> > > > > > > > > names
> > > > > > > > > > of
> > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > resource-related configurations in
> > > > TaskManagerOption.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > e.g.
> > > > taskmanager.resource.gpu.discovery-script.path
> > > > > > ->
> > > > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xintong Song <[hidden email]>
> > > > > wrote on Wed, Mar 4, 2020
> > > > > > > > > > at 10:39 AM:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an
> > > > offline
> > > > > > > > > discussion
> > > > > > > > > > > > about
> > > > > > > > > > > > > > > > making
> > > > > > > > > > > > > > > > > > the "GPU Support" as some general "Extended
> > > > > > Resource
> > > > > > > > > > > Support".
> > > > > > > > > > > > We
> > > > > > > > > > > > > > > > believe
> > > > > > > > > > > > > > > > > > supporting extended resources in a general
> > > > > > mechanism
> > > > > > > is
> > > > > > > > > > > > definitely
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > good
> > > > > > > > > > > > > > > > > > and extensible way. The reason we propose
> > > this
> > > > FLIP
> > > > > > > > > > narrowing
> > > > > > > > > > > > its
> > > > > > > > > > > > > > scope
> > > > > > > > > > > > > > > > > > down to GPU alone, is mainly for the
> > concern
> > > on
> > > > > > extra
> > > > > > > > > > efforts
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > review
> > > > > > > > > > > > > > > > > > capacity needed for a general mechanism.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > To come up with a good design for a general
> > > > extended
> > > > > > > > > > resource
> > > > > > > > > > > > > > management
> > > > > > > > > > > > > > > > > > mechanism, we would need to investigate
> > more
> > > > on how
> > > > > > > > > people
> > > > > > > > > > > use
> > > > > > > > > > > > > > > > different
> > > > > > > > > > > > > > > > > > kind of resources in practice. For GPU, we
> > > > learnt
> > > > > > > such
> > > > > > > > > > > > knowledge
> > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > experts, Becket and his team members. But
> > for
> > > > FPGA,
> > > > > > > or
> > > > > > > > > > other
> > > > > > > > > > > > > > potential
> > > > > > > > > > > > > > > > > > extended resources, we don't have such
> > > > convenient
> > > > > > > > > > information
> > > > > > > > > > > > > > sources,
> > > > > > > > > > > > > > > > > > making the investigation requires more
> > > efforts,
> > > > > > > which I
> > > > > > > > > > tend
> > > > > > > > > > > to
> > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > not necessary atm.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On the other hand, we also looked into how
> > > > Spark
> > > > > > > > > supports a
> > > > > > > > > > > > general
> > > > > > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > > > > Resource Scheduling". Assuming we want to
> > > have
> > > > a
> > > > > > > > similar
> > > > > > > > > > > > general
> > > > > > > > > > > > > > > > extended
> > > > > > > > > > > > > > > > > > resource mechanism in the future, we
> > believe
> > > > that
> > > > > > the
> > > > > > > > > > current
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > design can be easily extended, in an
> > > > incremental
> > > > > > way
> > > > > > > > > > without
> > > > > > > > > > > > too
> > > > > > > > > > > > > > many
> > > > > > > > > > > > > > > > > > reworks.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > - The most important part is probably user
> > > > > > > interfaces.
> > > > > > > > > > Spark
> > > > > > > > > > > > > > offers
> > > > > > > > > > > > > > > > > > configuration options to define the amount,
> > > > > > discovery
> > > > > > > > > > script
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > vendor
> > > > > > > > > > > > > > > > > > (on
> > > > > > > > > > > > > > > > > > > k8s) on a per-resource-type basis [1], which
> > > is
> > > > very
> > > > > > > > > similar
> > > > > > > > > > > to
> > > > > > > > > > > > > > what
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > proposed in this FLIP. I think it's not
> > > > necessary
> > > > > > to
> > > > > > > > > expose
> > > > > > > > > > > > > > config
> > > > > > > > > > > > > > > > > > options
> > > > > > > > > > > > > > > > > > in the general way atm, since we do not
> > have
> > > > > > supports
> > > > > > > > for
> > > > > > > > > > > other
> > > > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > types now. If later we decided to have per
> > > > resource
> > > > > > > > type
> > > > > > > > > > > config
> > > > > > > > > > > > > > > > > > options, we
> > > > > > > > > > > > > > > > > > can have backwards compatibility on the
> > > current
> > > > > > > > proposed
> > > > > > > > > > > > options
> > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > simple key mapping.
> > > > > > > > > > > > > > > > > > - For the GPU Manager, if later needed we
> > can
> > > > > > change
> > > > > > > it
> > > > > > > > > to
> > > > > > > > > > a
> > > > > > > > > > > > > > > > "Extended
> > > > > > > > > > > > > > > > > > Resource Manager" (or whatever it is
> > called).
> > > > That
> > > > > > > > should
> > > > > > > > > > be
> > > > > > > > > > > a
> > > > > > > > > > > > > > pure
> > > > > > > > > > > > > > > > > > component-internal refactoring.
> > > > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec,
> > there
> > > > are
> > > > > > > > already
> > > > > > > > > > > > > > fields for
> > > > > > > > > > > > > > > > > > general extended resource. We can of course
> > > > > > leverage
> > > > > > > > them
> > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > supporting
> > > > > > > > > > > > > > > > > > fine grained GPU scheduling. That is also
> > not
> > > > in
> > > > > > the
> > > > > > > > > scope
> > > > > > > > > > of
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > first
> > > > > > > > > > > > > > > > > > step proposal, and would require FLIP-56 to
> > > be
> > > > > > > finished
> > > > > > > > > > > first.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should
> > have
> > > a
> > > > > > > separate
> > > > > > > > > > FLIP
> > > > > > > > > > > > for
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > general extended resource mechanism, and
> > keep
> > > > it in
> > > > > > > > mind
> > > > > > > > > > when
> > > > > > > > > > > > > > > > discussing
> > > > > > > > > > > > > > > > > > and implementing the current one.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <
> > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes
> > > total
> > > > > > sense
> > > > > > > to
> > > > > > > > > > > > generalize
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > resource management to support custom
> > > > resources.
> > > > > > > > Having
> > > > > > > > > > > that
> > > > > > > > > > > > > > allows
> > > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > > > to add new resources by themselves. The
> > > > general
> > > > > > > > > resource
> > > > > > > > > > > > > > management
> > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > involve two different aspects:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 1. The custom resource type definition.
> > It
> > > is
> > > > > > > > supported
> > > > > > > > > > by
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > extended
> > > > > > > > > > > > > > > > > > > resources in ResourceProfile and
> > > > ResourceSpec.
> > > > > > This
> > > > > > > > > will
> > > > > > > > > > > > likely
> > > > > > > > > > > > > > cover
> > > > > > > > > > > > > > > > > > > majority of the cases.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic,
> > > > i.e. how
> > > > > > > to
> > > > > > > > > > assign
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > resources
> > > > > > > > > > > > > > > > > > > to different tasks, operators, and so on.
> > > > This
> > > > > > may
> > > > > > > > > > require
> > > > > > > > > > > > two
> > > > > > > > > > > > > > > > levels /
> > > > > > > > > > > > > > > > > > > steps:
> > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks
> > > > are put
> > > > > > > > into
> > > > > > > > > > > > > > suitable
> > > > > > > > > > > > > > > > > > slots.
> > > > > > > > > > > > > > > > > > > It is done by the global RM and is not
> > > > > > customizable
> > > > > > > > > right
> > > > > > > > > > > > now.
> > > > > > > > > > > > > > > > > > > b. Operator level - map the exact
> > resource
> > > > to the
> > > > > > > > > > operators
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > TM.
> > > > > > > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > > > > > > GPU 1 for operator A, GPU 2 for operator
> > B.
> > > > This
> > > > > > > step
> > > > > > > > > is
> > > > > > > > > > > > needed
> > > > > > > > > > > > > > > > assuming
> > > > > > > > > > > > > > > > > > > the global RM does not distinguish
> > > individual
> > > > > > > > resources
> > > > > > > > > > of
> > > > > > > > > > > > the
> > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > > type.
> > > > > > > > > > > > > > > > > > > It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b
> > here.
> > > > So it
> > > > > > > > > should
> > > > > > > > > > > > > > discover the
> > > > > > > > > > > > > > > > > > > physical GPU information and bind/match
> > > them
> > > > to
> > > > > > > each
> > > > > > > > > > > > > operator.
> > > > > > > > > > > > > > > > Making
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > general will fill in the missing piece to
> > > > support
> > > > > > > > > custom
> > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > type
> > > > > > > > > > > > > > > > > > > definition. But I'd avoid calling it a
> > > > "External
> > > > > > > > > Resource
> > > > > > > > > > > > > > Manager" to
> > > > > > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > > > > > > confusion with RM, maybe something like
> > > > "Operator
> > > > > > > > > > Resource
> > > > > > > > > > > > > > Assigner"
> > > > > > > > > > > > > > > > > > would
> > > > > > > > > > > > > > > > > > > be more accurate. So for each resource
> > type
> > > > users
> > > > > > > can
> > > > > > > > > > have
> > > > > > > > > > > an
> > > > > > > > > > > > > > > > optional
> > > > > > > > > > > > > > > > > > > "Operator Resource Assigner" in the TM.
> > For
> > > > > > memory,
> > > > > > > > > users
> > > > > > > > > > > > don't
> > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > > > this,
> > > > > > > > > > > > > > > > > > > but for other extended resources, users
> > may
> > > > need
> > > > > > > > that.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator
> > > > Resource
> > > > > > > > > > Assigner"
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > achievable
> > > > > > > > > > > > > > > > > > > in this FLIP. But I am also OK with
> > having
> > > > that
> > > > > > in
> > > > > > > a
> > > > > > > > > > > separate
> > > > > > > > > > > > > > FLIP
> > > > > > > > > > > > > > > > > > because
> > > > > > > > > > > > > > > > > > > the interface between the "Operator
> > > Resource
> > > > > > > > Assigner"
> > > > > > > > > > and
> > > > > > > > > > > > > > operator
> > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > take a while to settle down if we want to
> > > > make it
> > > > > > > > > > generic.
> > > > > > > > > > > > But I
> > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > > > > implementation should take this future
> > work
> > > > into
> > > > > > > > > > > > consideration so
> > > > > > > > > > > > > > > > that we
> > > > > > > > > > > > > > > > > > > don't need to break backwards
> > compatibility
> > > > once
> > > > > > we
> > > > > > > > > have
> > > > > > > > > > > > that.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan
> > > Ewen
> > > > <
> > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I cannot really give much input into
> > the
> > > > > > > mechanics
> > > > > > > > of
> > > > > > > > > > > > GPU-aware
> > > > > > > > > > > > > > > > > > > scheduling
> > > > > > > > > > > > > > > > > > > > and GPU allocation, as I have no
> > > experience
> > > > > > with
> > > > > > > > > that.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > One thought I had when reading the
> > > > proposal is
> > > > > > if
> > > > > > > > it
> > > > > > > > > > > makes
> > > > > > > > > > > > > > sense to
> > > > > > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > > the "GPU Manager" as an "External
> > > Resource
> > > > > > > > Manager",
> > > > > > > > > > and
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > is one
> > > > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > > > resource.
> > > > > > > > > > > > > > > > > > > > The way I understand the
> > ResourceProfile
> > > > and
> > > > > > > > > > > ResourceSpec,
> > > > > > > > > > > > > > that is
> > > > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > > is done there.
> > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more
> > > > > > > extensible.
> > > > > > > > > > Maybe
> > > > > > > > > > > > > > there is
> > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > Resource, a specialized NVIDIA GPU
> > > > Resource,
> > > > > > and
> > > > > > > > FPGA
> > > > > > > > > > > > > > > Resource, an
> > > > > > > > > > > > > > > > > > Alibaba
> > > > > > > > > > > > > > > > > > > > TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket
> > > Qin <
> > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP Yangze. GPU
> > > resource
> > > > > > > > management
> > > > > > > > > > > > support
> > > > > > > > > > > > > > is a
> > > > > > > > > > > > > > > > > > > > must-have
> > > > > > > > > > > > > > > > > > > > > for machine learning use cases.
> > > Actually
> > > > it
> > > > > > is
> > > > > > > > one
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > > > > > most
> > > > > > > > > > > > > > > > > > asked
> > > > > > > > > > > > > > > > > > > > > questions from the users who are
> > > > interested in
> > > > > > > > using
> > > > > > > > > > > Flink
> > > > > > > > > > > > > > for ML.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Some quick comments / questions to
> > the
> > > > wiki.
> > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should
> > probably
> > > > also
> > > > > > be
> > > > > > > > > > > > mentioned in
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > public
> > > > > > > > > > > > > > > > > > > > > interface section.
> > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds
> > GPU
> > > > info
> > > > > > > > also a
> > > > > > > > > > > > public
> > > > > > > > > > > > > > API?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM
> > Xintong
> > > > Song
> > > > > > <
> > > > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and
> > > > kicking
> > > > > > off
> > > > > > > > the
> > > > > > > > > > > > > > discussion,
> > > > > > > > > > > > > > > > > > Yangze.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting
> > > > using
> > > > > > of
> > > > > > > > GPU
> > > > > > > > > in
> > > > > > > > > > > > Flink
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > significant,
> > > > > > > > > > > > > > > > > > > > > > especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and
> > > it
> > > > > > looks
> > > > > > > > good
> > > > > > > > > > to
> > > > > > > > > > > > me. I
> > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > it's a
> > > > > > > > > > > > > > > > > > > > > > very good first step for Flink's
> > GPU
> > > > > > > > support.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM
> > > Yangze
> > > > Guo
> > > > > > <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > We would like to start a
> > discussion
> > > > > > thread
> > > > > > > on
> > > > > > > > > > > > "FLIP-108:
> > > > > > > > > > > > > > Add
> > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the
> > > > following
> > > > > > > > > issues:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - Enable user to configure how
> > many
> > > > GPUs
> > > > > > > in a
> > > > > > > > > > task
> > > > > > > > > > > > > > executor
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > > > forward such requirements to the
> > > > external
> > > > > > > > > > resource
> > > > > > > > > > > > > > managers
> > > > > > > > > > > > > > > > (for
> > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > - Provide information of
> > available
> > > > GPU
> > > > > > > > > resources
> > > > > > > > > > to
> > > > > > > > > > > > > > > > operators.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP
> > > are
> > > > as
> > > > > > > > > follows:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource
> > requirements
> > > > to
> > > > > > > > > > > > Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of
> > > the
> > > > task
> > > > > > > > > manager
> > > > > > > > > > > > > > services to
> > > > > > > > > > > > > > > > > > > > discover
> > > > > > > > > > > > > > > > > > > > > > > and expose GPU resource
> > information
> > > > to
> > > > > > the
> > > > > > > > > > context
> > > > > > > > > > > of
> > > > > > > > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script
> > for
> > > > GPU
> > > > > > > > > discovery,
> > > > > > > > > > > in
> > > > > > > > > > > > > > which we
> > > > > > > > > > > > > > > > > > > provide
> > > > > > > > > > > > > > > > > > > > > > > the privilege mode to help user
> > to
> > > > > > achieve
> > > > > > > > > > > > worker-level
> > > > > > > > > > > > > > > > isolation
> > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Please find more details in the
> > > FLIP
> > > > wiki
> > > > > > > > > > document
> > > > > > > > > > > > [1].
> > > > > > > > > > > > > > > > Looking
> > > > > > > > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > > your feedback.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > [1]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > >

Stephan Ewen

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Maybe one final comment: It is probably not an issue, but let's try and
keep user code (via user code classloader) out of the ResourceManager, if
possible.

As background:

There were thoughts in the past to support setups where the RM must run
with "superuser" credentials, but we cannot run JM/TM with these
credentials, as the user code might otherwise access them.
This is actually possible today: you can run the RM in a different JVM or
in a different container, and give it more credentials than the JMs / TMs. But
for this to be feasible, we cannot allow any user-defined code in the RM's
JVM, because that immediately breaks the isolation of credentials.
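
For illustration only, here is a rough sketch of the shape discussed in the
quoted messages below: deployment-specific interfaces kept separate from the
base driver, an `instanceof` check on the Flink side, and plain
`Map<String, String>` entries instead of `ContainerRequest` / `Pod`. All names
here (`YarnExternalResource`, `KubernetesExternalResource`, `GpuDriver`,
`ExternalResourceCollector`) are hypothetical and are not the interfaces from
the FLIP; the `nvidia.com/gpu` key is just an example value.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Base driver (hypothetical): contains nothing deployment specific. */
interface ExternalResourceDriver {
    String resourceName();
}

/** Optional mix-in (hypothetical) for drivers that support Yarn; no Hadoop types involved. */
interface YarnExternalResource {
    Map<String, String> getYarnExternalResource();
}

/** Optional mix-in (hypothetical) for drivers that support Kubernetes. */
interface KubernetesExternalResource {
    Map<String, String> getKubernetesExternalResource();
}

/** Example driver that only supports Kubernetes, so it simply omits the Yarn interface. */
final class GpuDriver implements ExternalResourceDriver, KubernetesExternalResource {
    private final long amount;

    GpuDriver(long amount) {
        this.amount = amount;
    }

    @Override
    public String resourceName() {
        return "gpu";
    }

    @Override
    public Map<String, String> getKubernetesExternalResource() {
        // Example key/value only; the real key depends on the cluster setup.
        return Collections.singletonMap("nvidia.com/gpu", Long.toString(amount));
    }
}

/** Flink-side helper (hypothetical): gathers entries; Flink would assemble the Pod / request. */
final class ExternalResourceCollector {
    static Map<String, String> forKubernetes(Iterable<? extends ExternalResourceDriver> drivers) {
        Map<String, String> result = new LinkedHashMap<>();
        for (ExternalResourceDriver driver : drivers) {
            if (driver instanceof KubernetesExternalResource) {
                result.putAll(((KubernetesExternalResource) driver).getKubernetesExternalResource());
            }
        }
        return result;
    }

    static Map<String, String> forYarn(Iterable<? extends ExternalResourceDriver> drivers) {
        Map<String, String> result = new LinkedHashMap<>();
        for (ExternalResourceDriver driver : drivers) {
            if (driver instanceof YarnExternalResource) {
                result.putAll(((YarnExternalResource) driver).getYarnExternalResource());
            }
        }
        return result;
    }
}

final class ExternalResourceSketch {
    public static void main(String[] args) {
        List<ExternalResourceDriver> drivers = Arrays.asList(new GpuDriver(2));
        System.out.println(ExternalResourceCollector.forKubernetes(drivers));
        System.out.println(ExternalResourceCollector.forYarn(drivers));
    }
}
```

Note that in this sketch the driver object still lives in whatever process
calls the collector, so by itself it does not keep user code out of the RM; it
only narrows what that code can touch.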

On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:

> Thanks for the feedback, @Till and @Xintong.
>
> Regarding separating the interface, I'm also +1 with it.
>
> Regarding the resource allocation interface, true, it's dangerous to
> give user code that much access. Changing the return type to Map<String
> key, String/Long value> makes sense to me. AFAIK, it is compatible
> with all the first-party supported resources for Yarn/Kubernetes. It
> could also free us from the potential dependency issue.
>
> Best,
> Yangze Guo
>
> On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]>
> wrote:
> >
> > Thanks for updating the FLIP, Yangze.
> >
> > I agree with Till that we probably want to separate the K8s/Yarn decorator
> > calls. Users can still configure one driver class, and we can use
> > `instanceof` to check whether the driver implements K8s/Yarn specific
> > interfaces.
> >
> > Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
> > (`AbstractKubernetesStepDecorator` directly manipulates `Pod`) to user
> > code. It gives user code more access than is needed for defining an
> > external resource, which might cause problems. Instead, I would suggest
> > having an interface like `Map<String key, String value>
> > getYarn/KubernetesExternalResource()` and assembling the entries into
> > `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]>
> wrote:
> >
> > > Hi everyone,
> > >
> > > I'm a bit late to the party. I think the current proposal looks good.
> > >
> > > Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> > > would suggest not including the decorator calls for Kubernetes and Yarn in
> > > the base interface. Instead I would suggest segregating the deployment
> > > specific decorator calls into separate interfaces. That way an
> > > ExternalResourceDriver does not have to support all deployments from the
> > > very beginning. Moreover, some resources might not be supported by a
> > > specific deployment target and the natural way to express this would be to
> > > not implement the respective deployment specific interface.
> > >
> > > Moreover, having void
> > > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > > in the ExternalResourceDriver interface would require Hadoop on Flink's
> > > classpath whenever the external resource driver is being used.
> > >
> > > [1]
> > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > >
> > > Cheers,
> > > Till
> > >
> > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]>
> wrote:
> > >
> > > > Nice, thanks a lot!
> > > >
> > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]>
> wrote:
> > > >
> > > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > > >
> > > > > > I've updated the FLIP accordingly. I did not add a
> > > > > > ResourceInfoProvider. Instead, I introduced the ExternalResourceDriver,
> > > > > > which takes responsibility for all relevant operations on both the RM
> > > > > > and TM sides.
> > > > > > After rethinking the decoupling of external resource management
> > > > > > from the TaskExecutor, I think we could do the same thing on the
> > > > > > ResourceManager side. We do not need to add specific allocation
> > > > > > logic to the ResourceManager each time we add a specific external
> > > > > > resource.
> > > > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > > > containerRequest.
> > > > > > - For Kubernetes, the ExternalResourceDriver could provide a decorator for
> > > > > > the TM pod.
> > > > > >
> > > > > > In this way, just like MetricReporter, we allow users to define their
> > > > > > custom ExternalResourceDriver. It is more extensible and fits the
> > > > > > separation of concerns. For more details, please take a look at [1].
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]>
> wrote:
> > > > > >
> > > > > > This sounds good to go ahead from my side.
> > > > > >
> > > > > > I like the approach that Becket suggested - in that case the core
> > > > > > abstraction that everyone would need to understand would be
> "external
> > > > > > resource allocation" and the "ResourceInfoProvider", and the GPU
> > > > specific
> > > > > > code would be a specific implementation only known to that
> component
> > > > that
> > > > > > allocates the external resource. That fits the separation of
> concerns
> > > > > well.
> > > > > >
> > > > > > I also understand that it should not be over-engineered in the
> first
> > > > > > version, so some simplification makes sense, and then gradually
> > > expand
> > > > > from
> > > > > > there.
> > > > > >
> > > > > > So +1 to go ahead with what was suggested above (Xintong /
> Becket)
> > > from
> > > > > my
> > > > > > side.
> > > > > >
> > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <
> [hidden email]>
> > > > > wrote:
> > > > > >
> > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > >
> > > > > > > @Stephan
> > > > > > >
> > > > > > > I see your concern, and I completely agree with you that we
> should
> > > > > first
> > > > > > > think about the "library" / "plugin" / "extension" style if
> > > possible.
> > > > > > >
> > > > > > > If GPUs are sliced and assigned during scheduling, there may be
> > > > reason,
> > > > > > > > although it looks that it would belong to the slot then. Is
> that
> > > > > what we
> > > > > > > > are doing here?
> > > > > > >
> > > > > > >
> > > > > > > In the current proposal, we do not have the GPUs sliced and
> > > assigned
> > > > to
> > > > > > > slots, because it could be problematic without dynamic slot
> > > > allocation.
> > > > > > > E.g., the number of GPUs might not be evenly divisible by the
> > > number
> > > > of
> > > > > > > slots.
> > > > > > >
> > > > > > > I think it makes sense to eventually have the GPUs assigned to
> > > slots.
> > > > > Even
> > > > > > > then, we might still need a TM level GPUManager (or
> > > ResourceProvider
> > > > > like
> > > > > > > Becket suggested). For memory, in each slot we can simply
> request
> > > the
> > > > > > > amount of memory, leaving it to JVM / OS to decide which memory
> > > > > (address)
> > > > > > > should be assigned. For GPU, and potentially other resources
> like
> > > > > FPGA, we
> > > > > > > need to explicitly specify which GPU (index) should be used.
> > > > > Therefore, we
> > > > > > > need some component at the TM level to coordinate which slot
> uses
> > > > which
> > > > > > > GPU.
> > > > > > >
> > > > > > > IMO, unless we say Flink will not support slot-level GPU
> slicing at
> > > > > least
> > > > > > > in the foreseeable future, I don't see a good way to avoid
> touching
> > > > > the TM
> > > > > > > core. To that end, I think Becket's suggestion points to a good
> > > > > direction,
> > > > > > > that supports more features (GPU, FPGA, etc.) with less
> coupling to
> > > > > the TM
> > > > > > > core (only needs to understand the general interfaces). The
> > > detailed
> > > > > > > implementation for specific resource types can even be
> encapsulated
> > > > as
> > > > > a
> > > > > > > library.
> > > > > > >
> > > > > > > @Becket
> > > > > > >
> > > > > > > Thanks for sharing your thought on the final state. Despite the
> > > > > details how
> > > > > > > the interfaces should look like, I think this is a really good
> > > > > abstraction
> > > > > > > for supporting general resource types.
> > > > > > >
> > > > > > > I'd like to further clarify that, the following three things
> are
> > > all
> > > > > that
> > > > > > > the "Flink core" needs to understand.
> > > > > > >
> > > > > > > - The *amount* of resource, for scheduling. Actually, we
> already
> > > > > have
> > > > > > > the Resource class in ResourceProfile and ResourceSpec for
> > > > extended
> > > > > > > resource. It's just not really used.
> > > > > > > - The *info*, that Flink provides to the operators / user
> codes.
> > > > > > > - The *provider*, which generates the info based on the
> amount.
> > > > > > >
> > > > > > > The "core" does not need to understand the specific
> implementation
> > > > > details
> > > > > > > of the above three. They can even be implemented in a 3rd-party
> > > > > library.
> > > > > > > Similar to how we allow users to define their custom
> > > MetricReporter.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <
> [hidden email]>
> > > > > wrote:
> > > > > > >
> > > > > > > > Thanks for the comment, Stephan.
> > > > > > > >
> > > > > > > > - If everything becomes a "core feature", it will make the
> > > > project
> > > > > hard
> > > > > > > > > to develop in the future. Thinking "library" / "plugin" /
> > > > > "extension"
> > > > > > > > style
> > > > > > > > > where possible helps.
> > > > > > > >
> > > > > > > >
> > > > > > > > Completely agree. It is much more important to design a
> mechanism
> > > > > than
> > > > > > > > focusing on a specific case. Here is what I am thinking to
> fully
> > > > > support
> > > > > > > > custom resource management:
> > > > > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec
> to
> > > > > define
> > > > > > > the
> > > > > > > > resource and the amount required. They will be used to find
> > > > suitable
> > > > > TMs
> > > > > > > > slots to run the tasks. At this point, the resources are only
> > > > > measured by
> > > > > > > > amount, i.e. they do not have individual ID.
> > > > > > > >
> > > > > > > > 2. On the TM side, have something like
> *"ResourceInfoProvider"*
> > > to
> > > > > > > identify
> > > > > > > > > and provide the detailed information about the individual
> resource,
> > > > e.g.
> > > > > GPU
> > > > > > > > > ID. It is important because the operator may have to
> explicitly
> > > > > interact
> > > > > > > > with the physical resource it uses. The ResourceInfoProvider
> > > might
> > > > > look
> > > > > > > > like something below.
> > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(
> > > > > > > > >         OperatorId opId, ResourceProfile resourceProfile);
> > > > > > > > > }
> > > > > > > >
> > > > > > > > - There could be several "*ResourceInfoProvider*" configured
> on
> > > the
> > > > > TM to
> > > > > > > > retrieve the information for different resources.
> > > > > > > > - The TM will be responsible to assign those individual
> resources
> > > > to
> > > > > each
> > > > > > > > operator according to their requested amount.
> > > > > > > > - The operators will be able to get the ResourceInfo from
> their
> > > > > > > > RuntimeContext.
> > > > > > > >
> > > > > > > > > If we agree this is a reasonable final state, we can adapt
> the
> > > > > current
> > > > > > > FLIP
> > > > > > > > > to it. In fact it does not sound like a big change to me. All the
> > > > proposed
> > > > > > > > configuration can be as is, it is just that Flink itself
> won't
> > > care
> > > > > about
> > > > > > > > > them; instead a GPUInfoProvider implementing the
> > > > ResourceInfoProvider
> > > > > > > will
> > > > > > > > use them.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <
> [hidden email]>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all!
> > > > > > > > >
> > > > > > > > > The main point I wanted to throw into the discussion is the
> > > > > following:
> > > > > > > > > - With more and more use cases, more and more tools go
> into
> > > > Flink
> > > > > > > > > - If everything becomes a "core feature", it will make
> the
> > > > > project
> > > > > > > hard
> > > > > > > > > to develop in the future. Thinking "library" / "plugin" /
> > > > > "extension"
> > > > > > > > style
> > > > > > > > > where possible helps.
> > > > > > > > >
> > > > > > > > > - A good thought experiment is always: How many future
> > > > developers
> > > > > > > have
> > > > > > > > to
> > > > > > > > > interact with this code (and possibly understand it
> partially),
> > > > > even if
> > > > > > > > the
> > > > > > > > > features they touch have nothing to do with GPU support. If
> > > many
> > > > > > > > > contributors to unrelated features will have to touch it
> and
> > > > > understand
> > > > > > > > it,
> > > > > > > > > then let's think if there is a different solution. Maybe
> there
> > > is
> > > > > not,
> > > > > > > > but
> > > > > > > > > then we should be sure why.
> > > > > > > > >
> > > > > > > > > - That led me to raising this issue: If the GPU manager
> > > > becomes a
> > > > > > > core
> > > > > > > > > service in the TaskManager, Environment, RuntimeContext,
> etc.
> > > > then
> > > > > > > > everyone
> > > > > > > > > developing TM and streaming tasks need to understand the
> GPU
> > > > > manager.
> > > > > > > > That
> > > > > > > > > seems oddly specific, is my impression.
> > > > > > > > >
> > > > > > > > > Access to configuration seems not the right reason to do
> that.
> > > We
> > > > > > > should
> > > > > > > > > expose the Flink configuration from the RuntimeContext
> anyways.
> > > > > > > > >
> > > > > > > > > If GPUs are sliced and assigned during scheduling, there
> may be
> > > > > reason,
> > > > > > > > > although it looks that it would belong to the slot then. Is
> > > that
> > > > > what
> > > > > > > we
> > > > > > > > > are doing here?
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Stephan
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <
> > > > > [hidden email]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > >
> > > > > > > > > > IMO, eventually an operator should only see info of GPUs
> that
> > > > are
> > > > > > > > > dedicated
> > > > > > > > > > for it, instead of all GPUs on the machine/container in
> the
> > > > > current
> > > > > > > > > design.
> > > > > > > > > > It does not make sense to let the user who writes a UDF
> to
> > > > worry
> > > > > > > about
> > > > > > > > > > coordination among multiple operators running on the same
> > > > > machine.
> > > > > > > And
> > > > > > > > if
> > > > > > > > > > we want to limit the GPU info an operator sees, we
> should not
> > > > > let the
> > > > > > > > > > operator to instantiate GPUManager, which means we have
> to
> > > > expose
> > > > > > > > > something
> > > > > > > > > > through runtime context, either GPU info or some kind of
> > > > limited
> > > > > > > access
> > > > > > > > > to
> > > > > > > > > > the GPUManager.
> > > > > > > > > >
> > > > > > > > > > Thank you~
> > > > > > > > > >
> > > > > > > > > > Xintong Song
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <
> > > > [hidden email]
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > > It probably makes sense for us to first agree on the
> final
> > > > > state.
> > > > > > > More
> > > > > > > > > > > specifically, will the resource info be exposed through
> > > > runtime
> > > > > > > > context
> > > > > > > > > > > eventually?
> > > > > > > > > > >
> > > > > > > > > > > If that is the final state and we have a seamless
> migration
> > > > > story
> > > > > > > > from
> > > > > > > > > > this
> > > > > > > > > > > FLIP to that final state, Personally I think it is OK
> to
> > > > > expose the
> > > > > > > > GPU
> > > > > > > > > > > info in the runtime context.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > > > > > > [hidden email]
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > I think what Stephan means (@Stephan, please correct
> me
> > > if
> > > > > I'm
> > > > > > > > wrong)
> > > > > > > > > > is
> > > > > > > > > > > > that, we might not need to hold and maintain the
> > > GPUManager
> > > > > as a
> > > > > > > > > > service
> > > > > > > > > > > in
> > > > > > > > > > > > TaskManagerServices or RuntimeContext. An
> alternative is
> > > to
> > > > > > > create
> > > > > > > > /
> > > > > > > > > > > > retrieve the GPUManager only in the operators that
> need
> > > it,
> > > > > e.g.,
> > > > > > > > > with
> > > > > > > > > > a
> > > > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > > > >
> > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > I agree with you on excluding GPUManager from
> > > > > > > TaskManagerServices.
> > > > > > > > > > > >
> > > > > > > > > > > > - For the first step, where we provide unified
> > > TM-level
> > > > > GPU
> > > > > > > > > > > information
> > > > > > > > > > > > to all operators, it should be fine to have
> operators
> > > > > access /
> > > > > > > > > > > > lazy-initiate GPUManager by themselves.
> > > > > > > > > > > > - In future, we might have some more fine-grained
> GPU
> > > > > > > > management,
> > > > > > > > > > > where
> > > > > > > > > > > > we need to maintain GPUManager as a service and
> put
> > > GPU
> > > > > info
> > > > > > > in
> > > > > > > > > slot
> > > > > > > > > > > > profiles. But at least for now it's not necessary
> to
> > > > > introduce
> > > > > > > > > such
> > > > > > > > > > > > complexity.
> > > > > > > > > > > >
> > > > > > > > > > > > However, I have some concerns on excluding GPUManager
> > > from
> > > > > > > > > > RuntimeContext
> > > > > > > > > > > > and let operators access it directly.
> > > > > > > > > > > >
> > > > > > > > > > > > - Configurations needed for creating the
> > > > > > > > > > > > > GPUManager are
> > > > not
> > > > > > > > always
> > > > > > > > > > > > available for operators.
> > > > > > > > > > > > - If later we want to have fine-grained control
> over
> > > GPU
> > > > > > > (e.g.,
> > > > > > > > > > > > operators in each slot can only see GPUs reserved
> for
> > > > that
> > > > > > > > slot),
> > > > > > > > > > the
> > > > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > > > >
> > > > > > > > > > > > I would suggest to wrap the GPUManager behind
> > > > RuntimeContext
> > > > > and
> > > > > > > > only
> > > > > > > > > > > > expose the GPUInfo to users. For now, we can declare
> a
> > > > method
> > > > > > > > > > > > `getGPUInfo()` in RuntimeContext, with a default
> > > definition
> > > > > that
> > > > > > > > > calls
> > > > > > > > > > > > `GPUManager.get()` to get the lazily-created
> GPUManager.
> > > If
> > > > > later
> > > > > > > > we
> > > > > > > > > > want
> > > > > > > > > > > > to create / retrieve GPUManager in a different way,
> we
> > > can
> > > > > simply
> > > > > > > > > > change
> > > > > > > > > > > > how `getGPUInfo` is implemented, without needing to
> > > change
> > > > > any
> > > > > > > > public
> > > > > > > > > > > > interfaces.
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you~
> > > > > > > > > > > >
> > > > > > > > > > > > Xintong Song
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <
> > > > > [hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > Do you mean Minicluster? Yes, it makes sense to
> share
> > > the
> > > > > GPU
> > > > > > > > > Manager
> > > > > > > > > > > > > in such scenario.
> > > > > > > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > > > > > > GPUManager(ExternalResourceManagers) in
> TaskExecutor
> > > > > instead of
> > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regarding the RuntimeContext/FunctionContext, it
> just
> > > > > holds the
> > > > > > > > GPU
> > > > > > > > > > > > > info instead of the GPU Manager. AFAIK, it's the
> only
> > > > > place we
> > > > > > > > > could
> > > > > > > > > > > > > pass GPU info to the
> RichFunction/UserDefinedFunction.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <[hidden email]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000 [hidden email] wrote ----
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Can we somehow keep this out of the TaskManager
> > > > > > > > > > > > > > > > > services
> > > > > > > > > > > > > > > > I fear that we could not. IMO, the GPUManager (or
> > > > > > > > > > > > > > > > ExternalServicesManagers in future) is conceptually one of the
> > > > > > > > > > > > > > > > task manager services, just like MemoryManager before 1.10.
> > > > > > > > > > > > > > > > - It maintains/holds the GPU resource at TM level and all of
> > > > > > > > > > > > > > > > the operators allocate the GPU resources from it. So, it
> > > > > > > > > > > > > > > > should be exclusive to a single TaskExecutor.
> > > > > > > > > > > > > > > > - We could add a collection called ExternalResourceManagers to
> > > > > > > > > > > > > > > > hold all managers of other external resources in the future.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Can you help me understand why this needs the addition in
> > > > > > > > > > > > > > > TaskManagerServices or in the RuntimeContext?
> > > > > > > > > > > > > > > Are you worried about the case when multiple Task Executors run
> > > > > > > > > > > > > > > in the same JVM? That's not common, but wouldn't it actually be
> > > > > > > > > > > > > > > good in that case to share the GPU Manager, given that the GPU
> > > > > > > > > > > > > > > is shared?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > > > > In this FLIP, operators need the information. Thus, we expose
> > > > > > > > > > > > > > > > GPU information to the RuntimeContext/FunctionContext. The slot
> > > > > > > > > > > > > > > > profile is not aware of GPU resources, as GPU is a TM-level
> > > > > > > > > > > > > > > > resource now.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Can the GPU Manager be a "self contained" thing that simply
> > > > > > > > > > > > > > > > > takes the configuration, and then abstracts everything
> > > > > > > > > > > > > > > > > internally?
> > > > > > > > > > > > > > > > Yes, we just pass it the path/args of the discovery script and
> > > > > > > > > > > > > > > > the number of GPUs per TM. It takes the responsibility to get
> > > > > > > > > > > > > > > > the GPU information and expose it to the
> > > > > > > > > > > > > > > > RuntimeContext/FunctionContext of operators. Meanwhile, we'd
> > > > > > > > > > > > > > > > better not allow operators to directly access GPUManager; they
> > > > > > > > > > > > > > > > should get what they want from the Context. We could then
> > > > > > > > > > > > > > > > decouple the interface/implementation of GPUManager from the
> > > > > > > > > > > > > > > > public API.
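As a rough sketch of the "self contained" manager described above: the manager receives only configuration (the discovery script and the number of GPUs per TM) and hands operators plain GPU information. The snippet below covers only the parsing step. The comma-separated output format and the `GpuDiscovery` class are assumptions made for illustration; the thread does not specify the script's output format, and actually launching the script is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper; not part of Flink. */
final class GpuDiscovery {

    private GpuDiscovery() {}

    /**
     * Parses the stdout of a discovery script and keeps the first `amount` GPU indexes.
     * Assumption: the script prints comma-separated indexes such as "0,1,3".
     */
    static List<String> parseIndexes(String scriptOutput, int amount) {
        List<String> found = new ArrayList<>();
        for (String token : scriptOutput.trim().split(",")) {
            String index = token.trim();
            if (!index.isEmpty()) {
                found.add(index);
            }
        }
        if (found.size() < amount) {
            throw new IllegalStateException(
                    "Requested " + amount + " GPUs but discovered only " + found.size());
        }
        // Copy the sub-list so the result does not alias the working list.
        return Collections.unmodifiableList(new ArrayList<>(found.subList(0, amount)));
    }
}
```

The returned list is immutable, which fits the point that operators receive information from the context rather than a handle to the manager itself.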
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It sounds fine to initially start with GPU specific support
> > > > > > > > > > > > > > > > > and think about generalizing this once we better understand
> > > > > > > > > > > > > > > > > the space.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager services?
> > > > > > > > > > > > > > > > > Anything we have to pull through all layers of the TM makes
> > > > > > > > > > > > > > > > > the TM components yet more complex and harder to maintain.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > > > > > -> do the slot profiles need information about the GPU?
> > > > > > > > > > > > > > > > > -> Can the GPU Manager be a "self contained" thing that
> > > > > > > > > > > > > > > > > simply takes the configuration, and then abstracts
> > > > > > > > > > > > > > > > > everything internally? Operators can access it via
> > > > > > > > > > > > > > > > > "GPUManager.get()" or so?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right, I'll add them
> > > > > > > > > > > > > > > > > > to the Public API section.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > Regarding the general extended resource mechanism, I second
> > > > > > > > > > > > > > > > > > Xintong's suggestion.
> > > > > > > > > > > > > > > > > > - It's better to leverage ResourceProfile and ResourceSpec
> > > > > > > > > > > > > > > > > > after we support fine-grained GPU scheduling. As a
> > > > > > > > > > > > > > > > > > first-step proposal, I prefer not to include it in the
> > > > > > > > > > > > > > > > > > scope of this FLIP.
> > > > > > > > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I understand
> > > > > > > > > > > > > > > > > > correctly, it is just a code refactoring atm; we could
> > > > > > > > > > > > > > > > > > extract the open/close/allocateExtendResources of
> > > > > > > > > > > > > > > > > > GPUManager to that interface. If that is the case, +1 to do
> > > > > > > > > > > > > > > > > > it during implementation.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > As Xintong said, we looked into how Spark supports a general
> > > > > > > > > > > > > > > > > > "Custom Resource Scheduling" before and decided to introduce
> > > > > > > > > > > > > > > > > > a common resource configuration schema
> > > > > > > > > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > > > > to make it more extensible. I think "resource" is a proper
> > > > > > > > > > > > > > > > > > level to contain all the configs of extended resources.
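To illustrate the common configuration schema mentioned above, here is a small sketch that derives the per-resource keys from a resource name. The key pattern `taskmanager.resource.{resourceName}.amount/discovery-script` is taken from the message; the `ExternalResourceKeys` class, its method names, and the default of zero for an unset amount are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper that builds config keys following the proposed schema. */
final class ExternalResourceKeys {
    private static final String PREFIX = "taskmanager.resource.";

    private ExternalResourceKeys() {}

    /** Key holding how many units of the resource a task executor has. */
    static String amountKey(String resourceName) {
        return PREFIX + resourceName + ".amount";
    }

    /** Key holding the discovery script setting for the resource. */
    static String discoveryScriptKey(String resourceName) {
        return PREFIX + resourceName + ".discovery-script";
    }

    /** Reads the configured amount, treating an unset key as zero (an assumption). */
    static int amount(Map<String, String> conf, String resourceName) {
        String value = conf.get(amountKey(resourceName));
        return value == null ? 0 : Integer.parseInt(value.trim());
    }

    /** Small usage example with a plain map standing in for the configuration. */
    static Map<String, String> exampleConf() {
        Map<String, String> conf = new HashMap<>();
        conf.put(amountKey("gpu"), "2");
        return conf;
    }
}
```

Because the resource name is just a path segment, another resource type such as `fpga` would reuse the same two suffixes without new option classes.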
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource management support will
> > > > > > > > > > > > > > > > > > > greatly facilitate the development of AI-related
> > > > > > > > > > > > > > > > > > > applications by PyFlink users.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I think
> > > > > > > > > > > > > > > > > > > it is better to delete the resource field to make them
> > > > > > > > > > > > > > > > > > > consistent with the names of other resource-related
> > > > > > > > > > > > > > > > > > > configurations in TaskManagerOptions.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline
> > > > > > > > > > > > > > > > > > > > discussion about making the "GPU Support" a general
> > > > > > > > > > > > > > > > > > > > "Extended Resource Support". We believe supporting
> > > > > > > > > > > > > > > > > > > > extended resources through a general mechanism is
> > > > > > > > > > > > > > > > > > > > definitely a good and extensible approach. The reason we
> > > > > > > > > > > > > > > > > > > > narrowed the scope of this FLIP down to GPU alone is
> > > > > > > > > > > > > > > > > > > > mainly the concern about the extra effort and review
> > > > > > > > > > > > > > > > > > > > capacity needed for a general mechanism.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > To come up with a good design for a general extended
> > > > > > > > > > > > > > > > > > > > resource management mechanism, we would need to
> > > > > > > > > > > > > > > > > > > > investigate more on how people use different kinds of
> > > > > > > > > > > > > > > > > > > > resources in practice. For GPU, we learnt such knowledge
> > > > > > > > > > > > > > > > > > > > from the experts, Becket and his team members. But for
> > > > > > > > > > > > > > > > > > > > FPGA, or other potential extended resources, we don't
> > > > > > > > > > > > > > > > > > > > have such convenient information sources, so the
> > > > > > > > > > > > > > > > > > > > investigation would require more effort, which I tend to
> > > > > > > > > > > > > > > > > > > > think is not necessary atm.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark
> > > > > > > > > > > > > > > > > > > > supports a general "Custom Resource Scheduling".
> > > > > > > > > > > > > > > > > > > > Assuming we want to have a similar general extended
> > > > > > > > > > > > > > > > > > > > resource mechanism in the future, we believe that the
> > > > > > > > > > > > > > > > > > > > current GPU support design can be easily extended, in an
> > > > > > > > > > > > > > > > > > > > incremental way without too much rework.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > - The most important part is probably user interfaces.
> > > > > > > > > > > > > > > > > > > > Spark offers configuration options to define the amount,
> > > > > > > > > > > > > > > > > > > > discovery script and vendor (on k8s) on a per resource
> > > > > > > > > > > > > > > > > > > > type basis [1], which is very similar to what we
> > > > > > > > > > > > > > > > > > > > proposed in this FLIP. I think it's not necessary to
> > > > > > > > > > > > > > > > > > > > expose config options in the general way atm, since we
> > > > > > > > > > > > > > > > > > > > do not have support for other resource types now. If
> > > > > > > > > > > > > > > > > > > > later we decide to have per resource type config
> > > > > > > > > > > > > > > > > > > > options, we can keep backwards compatibility with the
> > > > > > > > > > > > > > > > > > > > currently proposed options via simple key mapping.
> > > > > > > > > > > > > > > > > > > > - For the GPU Manager, if later needed we can change it
> > > > > > > > > > > > > > > > > > > > to an "Extended Resource Manager" (or whatever it is
> > > > > > > > > > > > > > > > > > > > called). That should be a pure component-internal
> > > > > > > > > > > > > > > > > > > > refactoring.
> > > > > > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are
> > > > > > > > > > > > > > > > > > > > already fields for general extended resources. We can of
> > > > > > > > > > > > > > > > > > > > course leverage them when supporting fine-grained GPU
> > > > > > > > > > > > > > > > > > > > scheduling. That is also not in the scope of this
> > > > > > > > > > > > > > > > > > > > first-step proposal, and would require FLIP-56 to be
> > > > > > > > > > > > > > > > > > > > finished first.
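The "simple key mapping" for backwards compatibility mentioned in the first bullet could look roughly like the sketch below: a lookup that prefers a newer key and falls back to an older one. Both key names in the usage note and the `KeyFallback` class are hypothetical placeholders for illustration; the thread does not define the future general option names.

```java
import java.util.Map;

/** Hypothetical helper showing a new-key-first lookup with a legacy fallback. */
final class KeyFallback {

    private KeyFallback() {}

    /**
     * Returns the value of newKey if present, otherwise the value of legacyKey,
     * otherwise null. The key names are supplied by the caller.
     */
    static String lookup(Map<String, String> conf, String newKey, String legacyKey) {
        String value = conf.get(newKey);
        if (value != null) {
            return value;
        }
        // Fall back so that configurations written against the old key keep working.
        return conf.get(legacyKey);
    }
}
```

For example, `KeyFallback.lookup(conf, "some.future.general.key", "some.current.gpu.key")` would keep an existing configuration working after a rename; both strings here are made-up names.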
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a
> > > > > > > > > > > > > > > > > > > > separate FLIP for the general extended resource
> > > > > > > > > > > > > > > > > > > > mechanism, and keep it in mind when discussing and
> > > > > > > > > > > > > > > > > > > > implementing the current one.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to
> > > > > > > > > > > > > > > > > > > > > generalize the resource management to support custom
> > > > > > > > > > > > > > > > > > > > > resources. Having that allows users to add new
> > > > > > > > > > > > > > > > > > > > > resources by themselves. The general resource
> > > > > > > > > > > > > > > > > > > > > management may involve two different aspects:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is
> > > > > > > > > > > > > > > > > > > > > supported by the extended resources in ResourceProfile
> > > > > > > > > > > > > > > > > > > > > and ResourceSpec. This will likely cover the majority
> > > > > > > > > > > > > > > > > > > > > of the cases.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to
> > > > > > > > > > > > > > > > > > > > > assign the resources to different tasks, operators,
> > > > > > > > > > > > > > > > > > > > > and so on. This may require two levels / steps:
> > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into
> > > > > > > > > > > > > > > > > > > > > suitable slots. It is done by the global RM and is not
> > > > > > > > > > > > > > > > > > > > > customizable right now.
> > > > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the
> > > > > > > > > > > > > > > > > > > > > operators in TM, e.g. GPU 1 for operator A, GPU 2 for
> > > > > > > > > > > > > > > > > > > > > operator B. This step is needed assuming the global RM
> > > > > > > > > > > > > > > > > > > > > does not distinguish individual resources of the same
> > > > > > > > > > > > > > > > > > > > > type. It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it
> > > > > > > > > > > > > > > > > > > > > should discover the physical GPU information and
> > > > > > > > > > > > > > > > > > > > > bind/match them to each operator. Making this general
> > > > > > > > > > > > > > > > > > > > > will fill in the missing piece to support custom
> > > > > > > > > > > > > > > > > > > > > resource type definition. But I'd avoid calling it an
> > > > > > > > > > > > > > > > > > > > > "External Resource Manager" to avoid confusion with
> > > > > > > > > > > > > > > > > > > > > RM; maybe something like "Operator Resource Assigner"
> > > > > > > > > > > > > > > > > > > > > would be more accurate. So for each resource type
> > > > > > > > > > > > > > > > > > > > > users can have an optional "Operator Resource
> > > > > > > > > > > > > > > > > > > > > Assigner" in the TM. For memory, users don't need
> > > > > > > > > > > > > > > > > > > > > this, but for other extended resources, users may
> > > > > > > > > > > > > > > > > > > > > need that.
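The operator-level step (2.b) described above, "GPU 1 for operator A, GPU 2 for operator B", can be sketched as a small TM-local assigner. This is a minimal illustration, not a proposed interface: the class shape, the string operator ids, and the choice of exclusive (one GPU per operator) assignment are all assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical TM-local component that maps concrete GPU indexes to operators. */
final class OperatorResourceAssigner {
    private final Deque<String> free;
    private final Map<String, String> assigned = new HashMap<>();

    OperatorResourceAssigner(List<String> gpuIndexes) {
        this.free = new ArrayDeque<>(gpuIndexes);
    }

    /** Returns the GPU index bound to the operator, binding a free one on first call. */
    synchronized String assign(String operatorId) {
        String existing = assigned.get(operatorId);
        if (existing != null) {
            return existing;
        }
        String index = free.poll();
        if (index == null) {
            throw new IllegalStateException("No free GPU left for operator " + operatorId);
        }
        assigned.put(operatorId, index);
        return index;
    }

    /** Returns the operator's GPU index to the free pool, if it had one. */
    synchronized void release(String operatorId) {
        String index = assigned.remove(operatorId);
        if (index != null) {
            free.addFirst(index);
        }
    }
}
```

Memory needs no such component because any address will do, whereas here each operator must be told which specific device index it owns.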
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource
> > > > > > > > > > > > > > > > > > > > > Assigner" is achievable in this FLIP. But I am also OK
> > > > > > > > > > > > > > > > > > > > > with having that in a separate FLIP because the
> > > > > > > > > > > > > > > > > > > > > interface between the "Operator Resource Assigner" and
> > > > > > > > > > > > > > > > > > > > > operator may take a while to settle down if we want to
> > > > > > > > > > > > > > > > > > > > > make it generic. But I think our implementation should
> > > > > > > > > > > > > > > > > > > > > take this future work into consideration so that we
> > > > > > > > > > > > > > > > > > > > > don't need to break backwards compatibility once we
> > > > > > > > > > > > > > > > > > > > > have that.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics
> > > > > > > > > > > > > > > > > > > > > > of GPU-aware scheduling and GPU allocation, as I
> > > > > > > > > > > > > > > > > > > > > > have no experience with that.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is if it
> > > > > > > > > > > > > > > > > > > > > > makes sense to look at the "GPU Manager" as an
> > > > > > > > > > > > > > > > > > > > > > "External Resource Manager", and GPU is one such
> > > > > > > > > > > > > > > > > > > > > > resource.
> > > > > > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and
> > > > > > > > > > > > > > > > > > > > > > ResourceSpec, that is how it is done there.
> > > > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible.
> > > > > > > > > > > > > > > > > > > > > > Maybe there is a GPU Resource, a specialized NVIDIA
> > > > > > > > > > > > > > > > > > > > > > GPU Resource, an FPGA Resource, an Alibaba TPU
> > > > > > > > > > > > > > > > > > > > > > Resource, etc.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP Yangze. GPU resource
> > > > > > > > > > > > > > > > > > > > > > > management support is a must-have for machine
> > > > > > > > > > > > > > > > > > > > > > > learning use cases. Actually it is one of the most
> > > > > > > > > > > > > > > > > > > > > > > frequently asked questions from users who are
> > > > > > > > > > > > > > > > > > > > > > > interested in using Flink for ML.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Some quick comments / questions on the wiki.
> > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be
> > > > > > > > > > > > > > > > > > > > > > > mentioned in the public interface section.
> > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also
> > > > > > > > > > > > > > > > > > > > > > > a public API?
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the
> > > > > > > > > > > > > > > > > > > > > > > > discussion, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of
> > > > > > > > > > > > > > > > > > > > > > > > GPUs in Flink is significant, especially for the
> > > > > > > > > > > > > > > > > > > > > > > > ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks
> > > > > > > > > > > > > > > > > > > > > > > > good to me. I think it's a very good first step
> > > > > > > > > > > > > > > > > > > > > > > > for Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on
> > > > > > > > > > > > > > > > > > > > > > > > > "FLIP-108: Add GPU support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following
> > > > > > > > > > > > > > > > > > > > > > > > > issues:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs a
> > > > > > > > > > > > > > > > > > > > > > > > > task executor has and forward such
> > > > > > > > > > > > > > > > > > > > > > > > > requirements to the external resource managers
> > > > > > > > > > > > > > > > > > > > > > > > > (for Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU
> > > > > > > > > > > > > > > > > > > > > > > > > resources to operators.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as
> > > > > > > > > > > > > > > > > > > > > > > > > follows:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to
> > > > > > > > > > > > > > > > > > > > > > > > > Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task
> > > > > > > > > > > > > > > > > > > > > > > > > manager services to discover and expose GPU
> > > > > > > > > > > > > > > > > > > > > > > > > resource information to the context of
> > > > > > > > > > > > > > > > > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU
> > > > > > > > > > > > > > > > > > > > > > > > > discovery, in which we provide the privilege
> > > > > > > > > > > > > > > > > > > > > > > > > mode to help users achieve worker-level
> > > > > > > > > > > > > > > > > > > > > > > > > isolation in standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki
> > > > > > > > > > > > > > > > > > > > > > > > > document [1].
> > > > > > > > > > > > > > > > > Looking
> > > > > > > > > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > > your feedbacks.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > > >
> > > >
> > >
>

Yangze Guo

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Hi, Stephan,

I see your concern and I totally agree with you.

The interface on the RM side is now `Map<String key, String/Long value>
getYarn/KubernetesExternalResource()`. The only meaningful information the
RM gets from it is the configuration key of that external resource in
Yarn/K8s; the "String/Long value" would be the same as
external-resource.{resourceName}.amount.
So I think it makes sense to replace these two interfaces with two
configs, i.e. external-resource.{resourceName}.yarn/kubernetes.key. We
may lose some extensibility, but AFAIK it would work for common
external resources like GPU and FPGA. WDYT?
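
As a rough illustration of the config-only approach, the RM could build
the per-deployment resource map purely from configuration, with no user
code involved. The sketch below is hypothetical and not part of the FLIP:
the class name ExternalResourceKeys, the method resolve, and the
"external-resources" list key are assumptions made for the example, and
the "yarn.io/gpu" / "nvidia.com/gpu" values are only sample settings. The
amount and yarn/kubernetes key options follow the names used above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not an actual Flink class. */
final class ExternalResourceKeys {

    // Assumed option listing the enabled resource names, separated by ';'.
    static final String RESOURCE_LIST_KEY = "external-resources";

    /**
     * Maps the deployment-specific resource key to the configured amount.
     * The deployment argument is expected to be "yarn" or "kubernetes".
     */
    static Map<String, Long> resolve(Map<String, String> conf, String deployment) {
        Map<String, Long> result = new LinkedHashMap<>();
        String names = conf.getOrDefault(RESOURCE_LIST_KEY, "");
        for (String rawName : names.split(";")) {
            String name = rawName.trim();
            if (name.isEmpty()) {
                continue;
            }
            String prefix = "external-resource." + name + ".";
            String amount = conf.get(prefix + "amount");
            String key = conf.get(prefix + deployment + ".key");
            // A resource without an amount or without a key for this
            // deployment is simply not forwarded to it.
            if (amount == null || key == null) {
                continue;
            }
            result.put(key, Long.parseLong(amount.trim()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("external-resources", "gpu");
        conf.put("external-resource.gpu.amount", "2");
        conf.put("external-resource.gpu.yarn.key", "yarn.io/gpu");
        conf.put("external-resource.gpu.kubernetes.key", "nvidia.com/gpu");
        System.out.println(resolve(conf, "yarn"));
        System.out.println(resolve(conf, "kubernetes"));
    }
}
```

With something like this, the Yarn/Kubernetes ResourceManager would only
need to copy the returned entries into the ContainerRequest / Pod, and a
resource that has no key configured for a deployment is just skipped.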

Best,
Yangze Guo

On Fri, Mar 27, 2020 at 7:59 PM Stephan Ewen <[hidden email]> wrote:

>
> Maybe one final comment: It is probably not an issue, but let's try and
> keep user code (via user code classloader) out of the ResourceManager, if
> possible.
>
> As background:
>
> There were thoughts in the past to support setups where the RM must run
> with "superuser" credentials, but we cannot run JM/TM with these
> credentials, as the user code might access them otherwise.
> This is actually possible today, you can run the RM in a different JVM or
> in a different container, and give it more credentials than JMs / TMs. But
> for this to be feasible, we cannot allow any user-defined code to be in the
> JVM, because that instantaneously breaks the isolation of credentials.
>
>
>
> On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:
>
> > Thanks for the feedback, @Till and @Xintong.
> >
> > Regarding separating the interface, I'm also +1 with it.
> >
> > Regarding the resource allocation interface: true, it's dangerous to
> > give user code that much access. Changing the return type to Map<String
> > key, String/Long value> makes sense to me. AFAIK, it is compatible
> > with all the first-party supported resources for Yarn/Kubernetes. It
> > would also free us from the potential dependency issue.
> >
> > Best,
> > Yangze Guo
> >
> > On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]>
> > wrote:
> > >
> > > Thanks for updating the FLIP, Yangze.
> > >
> > > I agree with Till that we probably want to separate the K8s/Yarn
> > > decorator calls. Users can still configure one driver class, and we
> > > can use `instanceof` to check whether the driver implements the
> > > K8s/Yarn specific interfaces.
> > >
> > > Moreover, I'm not sure about exposing the entire `ContainerRequest` /
> > > `Pod` (`AbstractKubernetesStepDecorator` directly manipulates `Pod`)
> > > to user code. It gives user code more access than is needed for
> > > defining an external resource, which might cause problems. Instead, I
> > > would suggest having an interface like `Map<String key, String value>
> > > getYarn/KubernetesExternalResource()` and assembling the result into
> > > `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]>
> > wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'm a bit late to the party. I think the current proposal looks good.
> > > >
> > > > Concerning the ExternalResourceDriver interface defined in the FLIP
> > > > [1], I would suggest not including the decorator calls for
> > > > Kubernetes and Yarn in the base interface. Instead, I would suggest
> > > > segregating the deployment specific decorator calls into separate
> > > > interfaces. That way an ExternalResourceDriver does not have to
> > > > support all deployments from the very beginning. Moreover, some
> > > > resources might not be supported by a specific deployment target and
> > > > the natural way to express this would be to not implement the
> > > > respective deployment specific interface.
> > > >
> > > > Moreover, having void
> > > > addExternalResourceToRequest(AMRMClient.ContainerRequest
> > > > containerRequest) in the ExternalResourceDriver interface would
> > > > require Hadoop on Flink's classpath whenever the external resource
> > > > driver is being used.
> > > >
> > > > [1]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]>
> > wrote:
> > > >
> > > > > Nice, thanks a lot!
> > > > >
> > > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]>
> > wrote:
> > > > >
> > > > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > > > >
> > > > > > I've updated the FLIP accordingly. I did not add a
> > > > > > ResourceInfoProvider. Instead, I introduced the
> > > > > > ExternalResourceDriver, which takes responsibility for all
> > > > > > relevant operations on both the RM and TM sides.
> > > > > > After rethinking how to decouple the management of external
> > > > > > resources from TaskExecutor, I think we could do the same thing
> > > > > > on the ResourceManager side. We do not need to add specific
> > > > > > allocation logic to the ResourceManager each time we add a
> > > > > > specific external resource.
> > > > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > > > containerRequest.
> > > > > > - For Kubernetes, the ExternalResourceDriver could provide a
> > > > > > decorator for the TM pod.
> > > > > >
> > > > > > In this way, just like MetricReporter, we allow users to define
> > > > > > their custom ExternalResourceDriver. It is more extensible and
> > > > > > fits the separation of concerns. For more details, please take a
> > > > > > look at [1].
> > > > > >
> > > > > > [1]
> > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]>
> > wrote:
> > > > > > >
> > > > > > > This sounds good to go ahead from my side.
> > > > > > >
> > > > > > > I like the approach that Becket suggested - in that case the core
> > > > > > > abstraction that everyone would need to understand would be
> > > > > > > "external resource allocation" and the "ResourceInfoProvider",
> > > > > > > and the GPU specific code would be a specific implementation
> > > > > > > only known to that component that allocates the external
> > > > > > > resource. That fits the separation of concerns well.
> > > > > > >
> > > > > > > I also understand that it should not be over-engineered in the
> > > > > > > first version, so some simplification makes sense, and then
> > > > > > > gradually expand from there.
> > > > > > >
> > > > > > > So +1 to go ahead with what was suggested above (Xintong /
> > > > > > > Becket) from my side.
> > > > > > >
> > > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <
> > [hidden email]>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > > >
> > > > > > > > @Stephan
> > > > > > > >
> > > > > > > > I see your concern, and I completely agree with you that we
> > should
> > > > > > first
> > > > > > > > think about the "library" / "plugin" / "extension" style if
> > > > possible.
> > > > > > > >
> > > > > > > > If GPUs are sliced and assigned during scheduling, there may be
> > > > > reason,
> > > > > > > > > although it looks that it would belong to the slot then. Is
> > that
> > > > > > what we
> > > > > > > > > are doing here?
> > > > > > > >
> > > > > > > >
> > > > > > > > In the current proposal, we do not have the GPUs sliced and
> > > > assigned
> > > > > to
> > > > > > > > slots, because it could be problematic without dynamic slot
> > > > > allocation.
> > > > > > > > E.g., the number of GPUs might not be evenly divisible by the
> > > > number
> > > > > of
> > > > > > > > slots.
> > > > > > > >
> > > > > > > > I think it makes sense to eventually have the GPUs assigned to
> > > > slots.
> > > > > > Even
> > > > > > > > then, we might still need a TM level GPUManager (or
> > > > ResourceProvider
> > > > > > like
> > > > > > > > Becket suggested). For memory, in each slot we can simply
> > request
> > > > the
> > > > > > > > amount of memory, leaving it to JVM / OS to decide which memory
> > > > > > (address)
> > > > > > > > should be assigned. For GPU, and potentially other resources
> > like
> > > > > > FPGA, we
> > > > > > > > need to explicitly specify which GPU (index) should be used.
> > > > > > Therefore, we
> > > > > > > > need some component at the TM level to coordinate which slot
> > uses
> > > > > which
> > > > > > > > GPU.
> > > > > > > >
> > > > > > > > IMO, unless we say Flink will not support slot-level GPU
> > slicing at
> > > > > > least
> > > > > > > > in the foreseeable future, I don't see a good way to avoid
> > touching
> > > > > > the TM
> > > > > > > > core. To that end, I think Becket's suggestion points to a good
> > > > > > direction,
> > > > > > > > that supports more features (GPU, FPGA, etc.) with less
> > coupling to
> > > > > > the TM
> > > > > > > > core (only needs to understand the general interfaces). The
> > > > detailed
> > > > > > > > implementation for specific resource types can even be
> > encapsulated
> > > > > as
> > > > > > a
> > > > > > > > library.
> > > > > > > >
> > > > > > > > @Becket
> > > > > > > >
> > > > > > > > Thanks for sharing your thought on the final state. Despite the
> > > > > > details how
> > > > > > > > the interfaces should look like, I think this is a really good
> > > > > > abstraction
> > > > > > > > for supporting general resource types.
> > > > > > > >
> > > > > > > > I'd like to further clarify that, the following three things
> > are
> > > > all
> > > > > > that
> > > > > > > > the "Flink core" needs to understand.
> > > > > > > >
> > > > > > > > - The *amount* of resource, for scheduling. Actually, we
> > already
> > > > > > have
> > > > > > > > the Resource class in ResourceProfile and ResourceSpec for
> > > > > extended
> > > > > > > > resource. It's just not really used.
> > > > > > > > - The *info*, that Flink provides to the operators / user
> > codes.
> > > > > > > > - The *provider*, which generates the info based on the
> > amount.
> > > > > > > >
> > > > > > > > The "core" does not need to understand the specific
> > implementation
> > > > > > details
> > > > > > > > of the above three. They can even be implemented in a 3rd-party
> > > > > > library.
> > > > > > > > Similar to how we allow users to define their custom
> > > > MetricReporter.
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > >
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <
> > [hidden email]>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks for the comment, Stephan.
> > > > > > > > >
> > > > > > > > > - If everything becomes a "core feature", it will make the
> > > > > project
> > > > > > hard
> > > > > > > > > > to develop in the future. Thinking "library" / "plugin" /
> > > > > > "extension"
> > > > > > > > > style
> > > > > > > > > > where possible helps.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Completely agree. It is much more important to design a
> > mechanism
> > > > > > than
> > > > > > > > > focusing on a specific case. Here is what I am thinking to
> > fully
> > > > > > support
> > > > > > > > > custom resource management:
> > > > > > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec
> > to
> > > > > > define
> > > > > > > > the
> > > > > > > > > resource and the amount required. They will be used to find
> > > > > suitable
> > > > > > TMs
> > > > > > > > > slots to run the tasks. At this point, the resources are only
> > > > > > measured by
> > > > > > > > > amount, i.e. they do not have individual ID.
> > > > > > > > >
> > > > > > > > > > 2. On the TM side, have something like
> > > > > > > > > > *"ResourceInfoProvider"* to identify and provide the
> > > > > > > > > > detailed information of the individual resource, e.g. GPU
> > > > > > > > > > ID. It is important because the operator may have to
> > > > > > > > > > explicitly interact with the physical resource it uses. The
> > > > > > > > > > ResourceInfoProvider might look like something below.
> > > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(
> > > > > > > > > >         OperatorId opId, ResourceProfile resourceProfile);
> > > > > > > > > > }
> > > > > > > > >
> > > > > > > > > - There could be several "*ResourceInfoProvider*" configured
> > on
> > > > the
> > > > > > TM to
> > > > > > > > > retrieve the information for different resources.
> > > > > > > > > - The TM will be responsible to assign those individual
> > resources
> > > > > to
> > > > > > each
> > > > > > > > > operator according to their requested amount.
> > > > > > > > > - The operators will be able to get the ResourceInfo from
> > their
> > > > > > > > > RuntimeContext.
> > > > > > > > >
> > > > > > > > > > If we agree this is a reasonable final state, we can adapt
> > > > > > > > > > the current FLIP to it. In fact it does not sound like a big
> > > > > > > > > > change to me. All the proposed configurations can stay as
> > > > > > > > > > is; it is just that Flink itself won't care about them.
> > > > > > > > > > Instead, a GPUInfoProvider implementing the
> > > > > > > > > > ResourceInfoProvider will use them.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > >
> > > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <
> > [hidden email]>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all!
> > > > > > > > > >
> > > > > > > > > > The main point I wanted to throw into the discussion is the
> > > > > > following:
> > > > > > > > > > - With more and more use cases, more and more tools go
> > into
> > > > > Flink
> > > > > > > > > > - If everything becomes a "core feature", it will make
> > the
> > > > > > project
> > > > > > > > hard
> > > > > > > > > > to develop in the future. Thinking "library" / "plugin" /
> > > > > > "extension"
> > > > > > > > > style
> > > > > > > > > > where possible helps.
> > > > > > > > > >
> > > > > > > > > > - A good thought experiment is always: How many future
> > > > > developers
> > > > > > > > have
> > > > > > > > > to
> > > > > > > > > > interact with this code (and possibly understand it
> > partially),
> > > > > > even if
> > > > > > > > > the
> > > > > > > > > > features they touch have nothing to do with GPU support. If
> > > > many
> > > > > > > > > > contributors to unrelated features will have to touch it
> > and
> > > > > > understand
> > > > > > > > > it,
> > > > > > > > > > then let's think if there is a different solution. Maybe
> > there
> > > > is
> > > > > > not,
> > > > > > > > > but
> > > > > > > > > > then we should be sure why.
> > > > > > > > > >
> > > > > > > > > > - That led me to raising this issue: If the GPU manager
> > > > > becomes a
> > > > > > > > core
> > > > > > > > > > service in the TaskManager, Environment, RuntimeContext,
> > etc.
> > > > > then
> > > > > > > > > everyone
> > > > > > > > > > developing TM and streaming tasks need to understand the
> > GPU
> > > > > > manager.
> > > > > > > > > That
> > > > > > > > > > seems oddly specific, is my impression.
> > > > > > > > > >
> > > > > > > > > > Access to configuration seems not the right reason to do
> > that.
> > > > We
> > > > > > > > should
> > > > > > > > > > expose the Flink configuration from the RuntimeContext
> > anyways.
> > > > > > > > > >
> > > > > > > > > > If GPUs are sliced and assigned during scheduling, there
> > may be
> > > > > > reason,
> > > > > > > > > > although it looks that it would belong to the slot then. Is
> > > > that
> > > > > > what
> > > > > > > > we
> > > > > > > > > > are doing here?
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Stephan
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <
> > > > > > [hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > > >
> > > > > > > > > > > IMO, eventually an operator should only see info of GPUs
> > that
> > > > > are
> > > > > > > > > > dedicated
> > > > > > > > > > > for it, instead of all GPUs on the machine/container in
> > the
> > > > > > current
> > > > > > > > > > design.
> > > > > > > > > > > It does not make sense to let the user who writes a UDF
> > to
> > > > > worry
> > > > > > > > about
> > > > > > > > > > > coordination among multiple operators running on the same
> > > > > > machine.
> > > > > > > > And
> > > > > > > > > if
> > > > > > > > > > > we want to limit the GPU info an operator sees, we
> > should not
> > > > > > let the
> > > > > > > > > > > operator to instantiate GPUManager, which means we have
> > to
> > > > > expose
> > > > > > > > > > something
> > > > > > > > > > > through runtime context, either GPU info or some kind of
> > > > > limited
> > > > > > > > access
> > > > > > > > > > to
> > > > > > > > > > > the GPUManager.
> > > > > > > > > > >
> > > > > > > > > > > Thank you~
> > > > > > > > > > >
> > > > > > > > > > > Xintong Song
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <
> > > > > [hidden email]
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > > It probably makes sense for us to first agree on the final
> > > > > > > > > > > > > state. More specifically, will the resource info be exposed
> > > > > > > > > > > > > through the runtime context eventually?
> > > > > > > > > > > > >
> > > > > > > > > > > > > If that is the final state and we have a seamless migration
> > > > > > > > > > > > > story from this FLIP to that final state, personally I think
> > > > > > > > > > > > > it is OK to expose the GPU info in the runtime context.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > > > > > > > [hidden email]
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > > I think what Stephan means (@Stephan, please correct
> > me
> > > > if
> > > > > > I'm
> > > > > > > > > wrong)
> > > > > > > > > > > is
> > > > > > > > > > > > > that, we might not need to hold and maintain the
> > > > GPUManager
> > > > > > as a
> > > > > > > > > > > service
> > > > > > > > > > > > in
> > > > > > > > > > > > > TaskManagerServices or RuntimeContext. An
> > alternative is
> > > > to
> > > > > > > > create
> > > > > > > > > /
> > > > > > > > > > > > > retrieve the GPUManager only in the operators that
> > need
> > > > it,
> > > > > > e.g.,
> > > > > > > > > > with
> > > > > > > > > > > a
> > > > > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > > I agree with you on excluding GPUManager from
> > > > > > > > TaskManagerServices.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - For the first step, where we provide unified
> > > > TM-level
> > > > > > GPU
> > > > > > > > > > > > information
> > > > > > > > > > > > > to all operators, it should be fine to have
> > operators
> > > > > > access /
> > > > > > > > > > > > > lazy-initiate GPUManager by themselves.
> > > > > > > > > > > > > - In future, we might have some more fine-grained
> > GPU
> > > > > > > > > management,
> > > > > > > > > > > > where
> > > > > > > > > > > > > we need to maintain GPUManager as a service and
> > put
> > > > GPU
> > > > > > info
> > > > > > > > in
> > > > > > > > > > slot
> > > > > > > > > > > > > profiles. But at least for now it's not necessary
> > to
> > > > > > introduce
> > > > > > > > > > such
> > > > > > > > > > > > > complexity.
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, I have some concerns on excluding GPUManager
> > > > from
> > > > > > > > > > > RuntimeContext
> > > > > > > > > > > > > and let operators access it directly.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Configurations needed for creating the
> > GPUManager is
> > > > > not
> > > > > > > > > always
> > > > > > > > > > > > > available for operators.
> > > > > > > > > > > > > - If later we want to have fine-grained control
> > over
> > > > GPU
> > > > > > > > (e.g.,
> > > > > > > > > > > > > operators in each slot can only see GPUs reserved
> > for
> > > > > that
> > > > > > > > > slot),
> > > > > > > > > > > the
> > > > > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I would suggest to wrap the GPUManager behind
> > > > > RuntimeContext
> > > > > > and
> > > > > > > > > only
> > > > > > > > > > > > > expose the GPUInfo to users. For now, we can declare
> > a
> > > > > method
> > > > > > > > > > > > > `getGPUInfo()` in RuntimeContext, with a default
> > > > definition
> > > > > > that
> > > > > > > > > > calls
> > > > > > > > > > > > > `GPUManager.get()` to get the lazily-created
> > GPUManager.
> > > > If
> > > > > > later
> > > > > > > > > we
> > > > > > > > > > > want
> > > > > > > > > > > > > to create / retrieve GPUManager in a different way,
> > we
> > > > can
> > > > > > simply
> > > > > > > > > > > change
> > > > > > > > > > > > > how `getGPUInfo` is implemented, without needing to
> > > > change
> > > > > > any
> > > > > > > > > public
> > > > > > > > > > > > > interfaces.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > >
> > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <
> > > > > > [hidden email]>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > Do you mean Minicluster? Yes, it makes sense to
> > share
> > > > the
> > > > > > GPU
> > > > > > > > > > Manager
> > > > > > > > > > > > > > in such scenario.
> > > > > > > > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > > > > > > > GPUManager(ExternalResourceManagers) in
> > TaskExecutor
> > > > > > instead of
> > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regarding the RuntimeContext/FunctionContext, it
> > just
> > > > > > holds the
> > > > > > > > > GPU
> > > > > > > > > > > > > > info instead of the GPU Manager. AFAIK, it's the
> > only
> > > > > > place we
> > > > > > > > > > could
> > > > > > > > > > > > > > pass GPU info to the
> > RichFunction/UserDefinedFunction.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000
> > > > > [hidden email]
> > > > > > > > wrote
> > > > > > > > > > > ----
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Can we somehow keep this out of the
> > TaskManager
> > > > > > services
> > > > > > > > > > > > > > > > I fear that we could not. IMO, the
> > GPUManager(or
> > > > > > > > > > > > > > > > ExternalServicesManagers in future) is
> > conceptually
> > > > > > one of
> > > > > > > > > the
> > > > > > > > > > > task
> > > > > > > > > > > > > > > > manager services, just like MemoryManager
> > before
> > > > > 1.10.
> > > > > > > > > > > > > > > > - It maintains/holds the GPU resource at TM
> > level
> > > > and
> > > > > > all
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > > > > operators allocate the GPU resources from it.
> > So,
> > > > it
> > > > > > should
> > > > > > > > > be
> > > > > > > > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > > > > > > > - We could add a collection called
> > > > > > ExternalResourceManagers
> > > > > > > > > to
> > > > > > > > > > > hold
> > > > > > > > > > > > > > > > all managers of other external resources in the
> > > > > future.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Can you help me understand why this needs the addition in
> > > > > > > > > > > > > > > > TaskManagerServices or in the RuntimeContext?
> > > > > > > > > > > > > > > Are you worried about the case when multiple Task
> > > > > > Executors
> > > > > > > > run
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > JVM? That's not common, but wouldn't it actually
> > be
> > > > > good
> > > > > > in
> > > > > > > > > that
> > > > > > > > > > > case
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > share the GPU Manager, given that the GPU is
> > shared?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > > > > In this FLIP, operators need the information.
> > Thus,
> > > > > we
> > > > > > > > expose
> > > > > > > > > > GPU
> > > > > > > > > > > > > > > > information to the
> > RuntimeContext/FunctionContext.
> > > > > The
> > > > > > slot
> > > > > > > > > > > profile
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > not aware of GPU resources as GPU is TM level
> > > > > resource
> > > > > > now.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Can the GPU Manager be a "self contained"
> > thing
> > > > > that
> > > > > > > > simply
> > > > > > > > > > > takes
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > configuration, and then abstracts everything
> > > > > > internally?
> > > > > > > > > > > > > > > > Yes, we just pass the path/args of the discover
> > > > > script
> > > > > > and
> > > > > > > > > how
> > > > > > > > > > > many
> > > > > > > > > > > > > > > > GPUs per TM to it. It takes the responsibility
> > to
> > > > get
> > > > > > the
> > > > > > > > GPU
> > > > > > > > > > > > > > > > information and expose them to the
> > > > > > > > > > RuntimeContext/FunctionContext
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > Operators. Meanwhile, we'd better not allow
> > > > operators
> > > > > > to
> > > > > > > > > > directly
> > > > > > > > > > > > > > > > access GPUManager, it should get what they want
> > > > from
> > > > > > > > Context.
> > > > > > > > > > We
> > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > then decouple the interface/implementation of
> > > > > > GPUManager
> > > > > > > > and
> > > > > > > > > > > Public
> > > > > > > > > > > > > > > > API.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It sounds fine to initially start with GPU specific support and
> > > > > > > > > > > > > > > > > think about generalizing this once we better understand the
> > > > > > > > > > > > > > > > > space.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager services?
> > > > > > > > > > > > > > > > > Anything we have to pull through all layers of the TM makes the
> > > > > > > > > > > > > > > > > TM components yet more complex and harder to maintain.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > > > > > -> do the slot profiles need information about the GPU?
> > > > > > > > > > > > > > > > > -> Can the GPU Manager be a "self contained" thing that simply
> > > > > > > > > > > > > > > > > takes the configuration, and then abstracts everything
> > > > > > > > > > > > > > > > > internally? Operators can access it via "GPUManager.get()" or
> > > > > > > > > > > > > > > > > so?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right, I'll add them
> > > > > > > > > > > > > > > > > > to the Public API section.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > Regarding the general extended resource mechanism, I second
> > > > > > > > > > > > > > > > > > Xintong's suggestion.
> > > > > > > > > > > > > > > > > > - It's better to leverage ResourceProfile and ResourceSpec
> > > > > > > > > > > > > > > > > > once we support fine-grained GPU scheduling. As a first-step
> > > > > > > > > > > > > > > > > > proposal, I prefer not to include it in the scope of this
> > > > > > > > > > > > > > > > > > FLIP.
> > > > > > > > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I understand
> > > > > > > > > > > > > > > > > > correctly, it is just a code refactoring atm; we could
> > > > > > > > > > > > > > > > > > extract the open/close/allocateExtendResources of GPUManager
> > > > > > > > > > > > > > > > > > to that interface. If that is the case, +1 to do it during
> > > > > > > > > > > > > > > > > > implementation.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > As Xintong said, we looked into how Spark supports a general
> > > > > > > > > > > > > > > > > > "Custom Resource Scheduling" before and decided to introduce
> > > > > > > > > > > > > > > > > > a common resource configuration schema
> > > > > > > > > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > > > > to make it more extensible. I think "resource" is a proper
> > > > > > > > > > > > > > > > > > level to contain all the configs of extended resources.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource management support will
> > > > > > > > > > > > > > > > > > > greatly facilitate the development of AI-related
> > > > > > > > > > > > > > > > > > > applications by PyFlink users.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I think
> > > > > > > > > > > > > > > > > > > it is better to delete the "resource" field, which makes
> > > > > > > > > > > > > > > > > > > them consistent with the names of other resource-related
> > > > > > > > > > > > > > > > > > > configurations in TaskManagerOptions.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline
> > > > > > > > > > > > > > > > > > > > discussion about making the "GPU Support" some general
> > > > > > > > > > > > > > > > > > > > "Extended Resource Support". We believe supporting
> > > > > > > > > > > > > > > > > > > > extended resources through a general mechanism is
> > > > > > > > > > > > > > > > > > > > definitely a good and extensible way. The reason we
> > > > > > > > > > > > > > > > > > > > propose this FLIP with its scope narrowed down to GPU
> > > > > > > > > > > > > > > > > > > > alone is mainly the concern about the extra effort and
> > > > > > > > > > > > > > > > > > > > review capacity needed for a general mechanism.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > To come up with a good design for a general extended
> > > > > > > > > > > > > > > > > > > > resource management mechanism, we would need to
> > > > > > > > > > > > > > > > > > > > investigate more how people use different kinds of
> > > > > > > > > > > > > > > > > > > > resources in practice. For GPU, we learnt such knowledge
> > > > > > > > > > > > > > > > > > > > from the experts, Becket and his team members. But for
> > > > > > > > > > > > > > > > > > > > FPGA, or other potential extended resources, we don't
> > > > > > > > > > > > > > > > > > > > have such convenient information sources, so the
> > > > > > > > > > > > > > > > > > > > investigation would require more effort, which I tend to
> > > > > > > > > > > > > > > > > > > > think is not necessary atm.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark supports
> > > > > > > > > > > > > > > > > > > > a general "Custom Resource Scheduling". Assuming we want
> > > > > > > > > > > > > > > > > > > > to have a similar general extended resource mechanism in
> > > > > > > > > > > > > > > > > > > > the future, we believe that the current GPU support
> > > > > > > > > > > > > > > > > > > > design can be easily extended, in an incremental way
> > > > > > > > > > > > > > > > > > > > without too much rework.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > - The most important part is probably the user
> > > > > > > > > > > > > > > > > > > > interfaces. Spark offers configuration options to define
> > > > > > > > > > > > > > > > > > > > the amount, discovery script and vendor (on k8s) on a
> > > > > > > > > > > > > > > > > > > > per-resource-type basis [1], which is very similar to
> > > > > > > > > > > > > > > > > > > > what we proposed in this FLIP. I think it's not necessary
> > > > > > > > > > > > > > > > > > > > to expose config options in the general way atm, since we
> > > > > > > > > > > > > > > > > > > > do not have support for other resource types now. If we
> > > > > > > > > > > > > > > > > > > > later decide to have per-resource-type config options, we
> > > > > > > > > > > > > > > > > > > > can keep backwards compatibility with the currently
> > > > > > > > > > > > > > > > > > > > proposed options with simple key mapping.
> > > > > > > > > > > > > > > > > > > > - For the GPU Manager, if later needed we can change it
> > > > > > > > > > > > > > > > > > > > to an "Extended Resource Manager" (or whatever it is
> > > > > > > > > > > > > > > > > > > > called). That should be a pure component-internal
> > > > > > > > > > > > > > > > > > > > refactoring.
> > > > > > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are already
> > > > > > > > > > > > > > > > > > > > fields for general extended resources. We can of course
> > > > > > > > > > > > > > > > > > > > leverage them when supporting fine-grained GPU
> > > > > > > > > > > > > > > > > > > > scheduling. That is also not in the scope of this
> > > > > > > > > > > > > > > > > > > > first-step proposal, and would require FLIP-56 to be
> > > > > > > > > > > > > > > > > > > > finished first.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a
> > > > > > > > > > > > > > > > > > > > separate FLIP for the general extended resource
> > > > > > > > > > > > > > > > > > > > mechanism, and keep it in mind when discussing and
> > > > > > > > > > > > > > > > > > > > implementing the current one.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > [1] https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to
> > > > > > > > > > > > > > > > > > > > > generalize the resource management to support custom
> > > > > > > > > > > > > > > > > > > > > resources. Having that allows users to add new
> > > > > > > > > > > > > > > > > > > > > resources by themselves. The general resource
> > > > > > > > > > > > > > > > > > > > > management may involve two different aspects:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported
> > > > > > > > > > > > > > > > > > > > > by the extended resources in ResourceProfile and
> > > > > > > > > > > > > > > > > > > > > ResourceSpec. This will likely cover the majority of
> > > > > > > > > > > > > > > > > > > > > the cases.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to
> > > > > > > > > > > > > > > > > > > > > assign the resources to different tasks, operators, and
> > > > > > > > > > > > > > > > > > > > > so on. This may require two levels / steps:
> > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into
> > > > > > > > > > > > > > > > > > > > > suitable slots. It is done by the global RM and is not
> > > > > > > > > > > > > > > > > > > > > customizable right now.
> > > > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the
> > > > > > > > > > > > > > > > > > > > > operators in TM. e.g. GPU 1 for operator A, GPU 2 for
> > > > > > > > > > > > > > > > > > > > > operator B. This step is needed assuming the global RM
> > > > > > > > > > > > > > > > > > > > > does not distinguish individual resources of the same
> > > > > > > > > > > > > > > > > > > > > type. It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it
> > > > > > > > > > > > > > > > > > > > > should discover the physical GPU information and
> > > > > > > > > > > > > > > > > > > > > bind/match them to each operator. Making this general
> > > > > > > > > > > > > > > > > > > > > will fill in the missing piece to support custom
> > > > > > > > > > > > > > > > > > > > > resource type definition. But I'd avoid calling it an
> > > > > > > > > > > > > > > > > > > > > "External Resource Manager" to avoid confusion with RM;
> > > > > > > > > > > > > > > > > > > > > maybe something like "Operator Resource Assigner" would
> > > > > > > > > > > > > > > > > > > > > be more accurate. So for each resource type users can
> > > > > > > > > > > > > > > > > > > > > have an optional "Operator Resource Assigner" in the
> > > > > > > > > > > > > > > > > > > > > TM. For memory, users don't need this, but for other
> > > > > > > > > > > > > > > > > > > > > extended resources, users may need that.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource
> > > > > > > > > > > > > > > > > > > > > Assigner" is achievable in this FLIP. But I am also OK
> > > > > > > > > > > > > > > > > > > > > with having that in a separate FLIP because the
> > > > > > > > > > > > > > > > > > > > > interface between the "Operator Resource Assigner" and
> > > > > > > > > > > > > > > > > > > > > operator may take a while to settle down if we want to
> > > > > > > > > > > > > > > > > > > > > make it generic. But I think our implementation should
> > > > > > > > > > > > > > > > > > > > > take this future work into consideration so that we
> > > > > > > > > > > > > > > > > > > > > don't need to break backwards compatibility once we
> > > > > > > > > > > > > > > > > > > > > have that.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of
> > > > > > > > > > > > > > > > > > > > > > GPU-aware scheduling and GPU allocation, as I have no
> > > > > > > > > > > > > > > > > > > > > > experience with that.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is
> > > > > > > > > > > > > > > > > > > > > > whether it makes sense to look at the "GPU Manager"
> > > > > > > > > > > > > > > > > > > > > > as an "External Resource Manager", with GPU being one
> > > > > > > > > > > > > > > > > > > > > > such resource.
> > > > > > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and
> > > > > > > > > > > > > > > > > > > > > > ResourceSpec, that is how it is done there.
> > > > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible.
> > > > > > > > > > > > > > > > > > > > > > Maybe there is a GPU Resource, a specialized NVIDIA
> > > > > > > > > > > > > > > > > > > > > > GPU Resource, an FPGA Resource, an Alibaba TPU
> > > > > > > > > > > > > > > > > > > > > > Resource, etc.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP Yangze. GPU resource management
> > > > > > > > > > > > > > > > > > > > > > > support is a must-have for machine learning use
> > > > > > > > > > > > > > > > > > > > > > > cases. Actually it is one of the most frequently
> > > > > > > > > > > > > > > > > > > > > > > asked questions from users who are interested in
> > > > > > > > > > > > > > > > > > > > > > > using Flink for ML.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Some quick comments / questions on the wiki.
> > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be
> > > > > > > > > > > > > > > > > > > > > > > mentioned in the public interface section.
> > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a
> > > > > > > > > > > > > > > > > > > > > > > public API?
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the
> > > > > > > > > > > > > > > > > > > > > > > > discussion, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of
> > > > > > > > > > > > > > > > > > > > > > > > GPUs in Flink is significant, especially for the
> > > > > > > > > > > > > > > > > > > > > > > > ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good
> > > > > > > > > > > > > > > > > > > > > > > > to me. I think it's a very good first step for
> > > > > > > > > > > > > > > > > > > > > > > > Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on
> > > > > > > > > > > > > > > > > > > > > > > > > "FLIP-108: Add GPU support in Flink" [1].
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs a task
> > > > > > > > > > > > > > > > > > > > > > > > > executor has and forward such requirements to
> > > > > > > > > > > > > > > > > > > > > > > > > the external resource managers (for
> > > > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU
> > > > > > > > > > > > > > > > > > > > > > > > > resources to operators.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to
> > > > > > > > > > > > > > > > > > > > > > > > > Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task
> > > > > > > > > > > > > > > > > > > > > > > > > manager services to discover and expose GPU
> > > > > > > > > > > > > > > > > > > > > > > > > resource information to the context of
> > > > > > > > > > > > > > > > > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU
> > > > > > > > > > > > > > > > > > > > > > > > > discovery, in which we provide the privilege
> > > > > > > > > > > > > > > > > > > > > > > > > mode to help users achieve worker-level
> > > > > > > > > > > > > > > > > > > > > > > > > isolation in standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki
> > > > > > > > > > > > > > > > > > > > > > > > > document [1]. Looking forward to your feedback.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> >

Till Rohrmann

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

At the moment the RM does not have a user code class loader and I agree
with Stephan that it should stay like this. This, however, does not mean
that we cannot support pluggable components in the RM. As long as the
plugins are on the system's class path, it should be fine for the RM to
load them. For example, we could add external resources via Flink's plugin
mechanism or something similar.

A very simple implementation of such an ExternalResourceDriver could be a
class which simply returns what is written in the flink-conf.yaml under a
given key.
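
A minimal sketch of such a driver, under stated assumptions: the single-method ExternalResourceDriver interface below is hypothetical (the FLIP's real interface may look different), and a plain Map stands in for the parsed flink-conf.yaml. The config key names follow the external-resource.{resourceName}.kubernetes.key and external-resource.{resourceName}.amount naming from the quoted discussion below.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical interface for illustration only; not the FLIP-108 API. */
interface ExternalResourceDriver {
    /** Maps the deployment-specific resource key to the requested amount. */
    Map<String, Long> getKubernetesExternalResource();
}

/** Driver that only echoes what is configured for one resource name. */
final class ConfigBackedExternalResourceDriver implements ExternalResourceDriver {
    private final String resourceName;
    // Stand-in for the parsed flink-conf.yaml (assumption, not Flink's Configuration class).
    private final Map<String, String> flinkConf;

    ConfigBackedExternalResourceDriver(String resourceName, Map<String, String> flinkConf) {
        this.resourceName = resourceName;
        this.flinkConf = flinkConf;
    }

    @Override
    public Map<String, Long> getKubernetesExternalResource() {
        String prefix = "external-resource." + resourceName + ".";
        String key = flinkConf.get(prefix + "kubernetes.key");
        String amount = flinkConf.get(prefix + "amount");
        if (key == null || amount == null) {
            // Resource not configured for this deployment target: request nothing.
            return Collections.emptyMap();
        }
        return Collections.singletonMap(key, Long.parseLong(amount));
    }
}
```

Because the driver only reads configuration, no user code beyond this class would have to run inside the RM in this sketch.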

Cheers,
Till

On Mon, Mar 30, 2020 at 5:39 AM Yangze Guo <[hidden email]> wrote:

> Hi, Stephan,
>
> I see your concern and I totally agree with you.
>
> The interface on RM side is now `Map<String key, String/Long value>
> getYarn/KubernetesExternalResource()`. The only valid information the RM
> gets from it is the configuration key of that external resource in
> Yarn/K8s. The "String/Long value" would be the same as the
> external-resource.{resourceName}.amount.
> So, I think it makes sense to replace these two interfaces with two
> configs, i.e. external-resource.{resourceName}.yarn/kubernetes.key. We
> may lose some extensibility, but AFAIK it could work with common
> external resources like GPU, FPGA. WDYT?
>
> Best,
> Yangze Guo
>
> On Fri, Mar 27, 2020 at 7:59 PM Stephan Ewen <[hidden email]> wrote:
> >
> > Maybe one final comment: It is probably not an issue, but let's try and
> > keep user code (via user code classloader) out of the ResourceManager, if
> > possible.
> >
> > As background:
> >
> > There were thoughts in the past to support setups where the RM must run
> > with "superuser" credentials, but we cannot run JM/TM with these
> > credentials, as the user code might access them otherwise.
> > This is actually possible today, you can run the RM in a different JVM or
> > in a different container, and give it more credentials than JMs / TMs. But
> > for this to be feasible, we cannot allow any user-defined code to be in the
> > JVM, because that instantaneously breaks the isolation of credentials.
> >
> >
> >
> > On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:
> >
> > > Thanks for the feedback, @Till and @Xintong.
> > >
> > > Regarding separating the interface, I'm also +1 with it.
> > >
> > > Regarding the resource allocation interface, true, it's dangerous to
> > > give much access to user codes. Changing the return type to Map<String
> > > key, String/Long value> makes sense to me. AFAIK, it is compatible
> > > with all the first-party supported resources for Yarn/Kubernetes. It
> > > could also free us from the potential dependency issue as well.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]> wrote:
> > > >
> > > > Thanks for updating the FLIP, Yangze.
> > > >
> > > > I agree with Till that we probably want to separate the K8s/Yarn decorator
> > > > calls. Users can still configure one driver class, and we can use
> > > > `instanceof` to check whether the driver implements K8s/Yarn specific
> > > > interfaces.
> > > >
> > > > Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
> > > > (`AbstractKubernetesStepDecorator` directly manipulates `Pod`) to user
> > > > code. It gives more access to user code than needed for defining an external
> > > > resource, which might cause problems. Instead, I would suggest having an
> > > > interface like `Map<String key, String value>
> > > > getYarn/KubernetesExternalResource()` and assemble them into
> > > > `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > I'm a bit late to the party. I think the current proposal looks good.
> > > > >
> > > > > Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> > > > > would suggest to not include the decorator calls for Kubernetes and Yarn in
> > > > > the base interface. Instead I would suggest to segregate the deployment
> > > > > specific decorator calls into separate interfaces. That way an
> > > > > ExternalResourceDriver does not have to support all deployments from the
> > > > > very beginning. Moreover, some resources might not be supported by a
> > > > > specific deployment target and the natural way to express this would be to
> > > > > not implement the respective deployment specific interface.
> > > > >
> > > > > Moreover, having void
> > > > > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > > > > in the ExternalResourceDriver interface would require Hadoop on Flink's
> > > > > classpath whenever the external resource driver is being used.
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:
> > > > > >
> > > > > > Nice, thanks a lot!
> > > > > >
> > > > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
> > > > > >
> > > > > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > > > > >
> > > > > > > I've updated the FLIP accordingly. I do not add a
> > > > > > > ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> > > > > > > which takes the responsibility of all relevant operations on both RM
> > > > > > > and TM sides.
> > > > > > > After a rethink about decoupling the management of external resources
> > > > > > > from TaskExecutor, I think we could do the same thing on the
> > > > > > > ResourceManager side. We do not need to add a specific allocation
> > > > > > > logic to the ResourceManager each time we add a specific external
> > > > > > > resource.
> > > > > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > > > > containerRequest.
> > > > > > > - For Kubernetes, ExternalResourceDriver could provide a decorator for
> > > > > > > the TM pod.
> > > > > > >
> > > > > > > In this way, just like MetricReporter, we allow users to define their
> > > > > > > custom ExternalResourceDriver. It is more extensible and fits the
> > > > > > > separation of concerns. For more details, please take a look at [1].
> > > > > > >
> > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > This sounds good to go ahead from my side.
> > > > > > > >
> > > > > > > > I like the approach that Becket suggested - in that case the core
> > > > > > > > abstraction that everyone would need to understand would be "external
> > > > > > > > resource allocation" and the "ResourceInfoProvider", and the GPU specific
> > > > > > > > code would be a specific implementation only known to that component that
> > > > > > > > allocates the external resource. That fits the separation of concerns well.
> > > > > > > >
> > > > > > > > I also understand that it should not be over-engineered in the first
> > > > > > > > version, so some simplification makes sense, and then gradually expand from
> > > > > > > > there.
> > > > > > > >
> > > > > > > > So +1 to go ahead with what was suggested above (Xintong / Becket) from my
> > > > > > > > side.
> > > > > > > >
> > > > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > > > >
> > > > > > > > > @Stephan
> > > > > > > > >
> > > > > > > > > I see your concern, and I completely agree with you that we should first
> > > > > > > > > think about the "library" / "plugin" / "extension" style if possible.
> > > > > > > > >
> > > > > > > > > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > > > > > > > > although it looks that it would belong to the slot then. Is that what we
> > > > > > > > > > are doing here?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > In the current proposal, we do not have the GPUs sliced and assigned to
> > > > > > > > > slots, because it could be problematic without dynamic slot allocation.
> > > > > > > > > E.g., the number of GPUs might not be evenly divisible by the number of
> > > > > > > > > slots.
> > > > > > > > >
> > > > > > > > > I think it makes sense to eventually have the GPUs assigned to slots. Even
> > > > > > > > > then, we might still need a TM level GPUManager (or ResourceProvider like
> > > > > > > > > Becket suggested). For memory, in each slot we can simply request the
> > > > > > > > > amount of memory, leaving it to JVM / OS to decide which memory (address)
> > > > > > > > > should be assigned. For GPU, and potentially other resources like FPGA, we
> > > > > > > > > need to explicitly specify which GPU (index) should be used. Therefore, we
> > > > > > > > > need some component at the TM level to coordinate which slot uses which
> > > > > > > > > GPU.
> > > > > > > > >
> > > > > > > > > IMO, unless we say Flink will not support slot-level GPU slicing at least
> > > > > > > > > in the foreseeable future, I don't see a good way to avoid touching the TM
> > > > > > > > > core. To that end, I think Becket's suggestion points to a good direction,
> > > > > > > > > that supports more features (GPU, FPGA, etc.) with less coupling to the TM
> > > > > > > > > core (only needs to understand the general interfaces). The detailed
> > > > > > > > > implementation for specific resource types can even be encapsulated as a
> > > > > > > > > library.
> > > > > > > > >
> > > > > > > > > @Becket
> > > > > > > > >
> > > > > > > > > Thanks for sharing your thought on the final state. Despite the details how
> > > > > > > > > the interfaces should look like, I think this is a really good abstraction
> > > > > > > > > for supporting general resource types.
> > > > > > > > >
> > > > > > > > > I'd like to further clarify that, the following three things are all that
> > > > > > > > > the "Flink core" needs to understand.
> > > > > > > > >
> > > > > > > > > - The *amount* of resource, for scheduling. Actually, we already have
> > > > > > > > >   the Resource class in ResourceProfile and ResourceSpec for extended
> > > > > > > > >   resource. It's just not really used.
> > > > > > > > > - The *info*, that Flink provides to the operators / user codes.
> > > > > > > > > - The *provider*, which generates the info based on the amount.
> > > > > > > > >
> > > > > > > > > The "core" does not need to understand the specific implementation details
> > > > > > > > > of the above three. They can even be implemented in a 3rd-party library.
> > > > > > > > > Similar to how we allow users to define their custom MetricReporter.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > > Thanks for the comment, Stephan.
> > > > > > > > > >
> > > > > > > > > > > - If everything becomes a "core feature", it will make the project hard
> > > > > > > > > > > to develop in the future. Thinking "library" / "plugin" / "extension" style
> > > > > > > > > > > where possible helps.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Completely agree. It is much more important to design a mechanism than
> > > > > > > > > > focusing on a specific case. Here is what I am thinking to fully support
> > > > > > > > > > custom resource management:
> > > > > > > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to define the
> > > > > > > > > > resource and the amount required. They will be used to find suitable TM
> > > > > > > > > > slots to run the tasks. At this point, the resources are only measured by
> > > > > > > > > > amount, i.e. they do not have individual IDs.
> > > > > > > > > >
> > > > > > > > > > 2. On the TM side, have something like *"ResourceInfoProvider"* to identify
> > > > > > > > > > and provide the detailed information of the individual resource, e.g. GPU
> > > > > > > > > > ID. It is important because the operator may have to explicitly interact
> > > > > > > > > > with the physical resource it uses. The ResourceInfoProvider might look
> > > > > > > > > > like something below.
> > > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId,
> > > > > > > > > >         ResourceProfile resourceProfile);
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > - There could be several "*ResourceInfoProvider*" configured on the TM to
> > > > > > > > > > retrieve the information for different resources.
> > > > > > > > > > - The TM will be responsible to assign those individual resources to each
> > > > > > > > > > operator according to their requested amount.
> > > > > > > > > > - The operators will be able to get the ResourceInfo from their
> > > > > > > > > > RuntimeContext.
> > > > > > > > > >
> > > > > > > > > > If we agree this is a reasonable final state, we can adapt the current FLIP
> > > > > > > > > > to it. In fact it does not sound a big change to me. All the proposed
> > > > > > > > > > configuration can be as is, it is just that Flink itself won't care about
> > > > > > > > > > them, instead a GPUInfoProvider implementing the ResourceInfoProvider will
> > > > > > > > > > use them.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <
> > > [hidden email]>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all!
> > > > > > > > > > >
> > > > > > > > > > > The main point I wanted to throw into the discussion
> is the
> > > > > > > following:
> > > > > > > > > > > - With more and more use cases, more and more tools
> go
> > > into
> > > > > > Flink
> > > > > > > > > > > - If everything becomes a "core feature", it will
> make
> > > the
> > > > > > > project
> > > > > > > > > hard
> > > > > > > > > > > to develop in the future. Thinking "library" /
> "plugin" /
> > > > > > > "extension"
> > > > > > > > > > style
> > > > > > > > > > > where possible helps.
> > > > > > > > > > >
> > > > > > > > > > > - A good thought experiment is always: How many
> future
> > > > > > developers
> > > > > > > > > have
> > > > > > > > > > to
> > > > > > > > > > > interact with this code (and possibly understand it
> > > partially),
> > > > > > > even if
> > > > > > > > > > the
> > > > > > > > > > > features they touch have nothing to do with GPU
> support. If
> > > > > many
> > > > > > > > > > > contributors to unrelated features will have to touch
> it
> > > and
> > > > > > > understand
> > > > > > > > > > it,
> > > > > > > > > > > then let's think if there is a different solution.
> Maybe
> > > there
> > > > > is
> > > > > > > not,
> > > > > > > > > > but
> > > > > > > > > > > then we should be sure why.
> > > > > > > > > > >
> > > > > > > > > > > - That led me to raising this issue: If the GPU
> manager
> > > > > > becomes a
> > > > > > > > > core
> > > > > > > > > > > service in the TaskManager, Environment,
> RuntimeContext,
> > > etc.
> > > > > > then
> > > > > > > > > > everyone
> > > > > > > > > > > developing TM and streaming tasks need to understand
> the
> > > GPU
> > > > > > > manager.
> > > > > > > > > > That
> > > > > > > > > > > seems oddly specific, is my impression.
> > > > > > > > > > >
> > > > > > > > > > > Access to configuration seems not the right reason to
> do
> > > that.
> > > > > We
> > > > > > > > > should
> > > > > > > > > > > expose the Flink configuration from the RuntimeContext
> > > anyways.
> > > > > > > > > > >
> > > > > > > > > > > If GPUs are sliced and assigned during scheduling,
> there
> > > may be
> > > > > > > reason,
> > > > > > > > > > > although it looks that it would belong to the slot
> then. Is
> > > > > that
> > > > > > > what
> > > > > > > > > we
> > > > > > > > > > > are doing here?
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Stephan
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <
> > > > > > > [hidden email]>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > > > >
> > > > > > > > > > > > IMO, eventually an operator should only see info of
> GPUs
> > > that
> > > > > > are
> > > > > > > > > > > dedicated
> > > > > > > > > > > > for it, instead of all GPUs on the machine/container
> in
> > > the
> > > > > > > current
> > > > > > > > > > > design.
> > > > > > > > > > > > It does not make sense to let the user who writes a
> UDF
> > > to
> > > > > > worry
> > > > > > > > > about
> > > > > > > > > > > > coordination among multiple operators running on the
> same
> > > > > > > machine.
> > > > > > > > > And
> > > > > > > > > > if
> > > > > > > > > > > > we want to limit the GPU info an operator sees, we
> > > should not
> > > > > > > let the
> > > > > > > > > > > > operator to instantiate GPUManager, which means we
> have
> > > to
> > > > > > expose
> > > > > > > > > > > something
> > > > > > > > > > > > through runtime context, either GPU info or some
> kind of
> > > > > > limited
> > > > > > > > > access
> > > > > > > > > > > to
> > > > > > > > > > > > the GPUManager.
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you~
> > > > > > > > > > > >
> > > > > > > > > > > > Xintong Song
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <
> > > > > > [hidden email]
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > It probably make sense for us to first agree on the
> > > final
> > > > > > > state.
> > > > > > > > > More
> > > > > > > > > > > > > specifically, will the resource info be exposed
> through
> > > > > > runtime
> > > > > > > > > > context
> > > > > > > > > > > > > eventually?
> > > > > > > > > > > > >
> > > > > > > > > > > > > If that is the final state and we have a seamless
> > > migration
> > > > > > > story
> > > > > > > > > > from
> > > > > > > > > > > > this
> > > > > > > > > > > > > FLIP to that final state, Personally I think it is
> OK
> > > to
> > > > > > > expose the
> > > > > > > > > > GPU
> > > > > > > > > > > > > info in the runtime context.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > > > > > > > > [hidden email]
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > > > I think what Stephan means (@Stephan, please
> correct
> > > me
> > > > > if
> > > > > > > I'm
> > > > > > > > > > wrong)
> > > > > > > > > > > > is
> > > > > > > > > > > > > > that, we might not need to hold and maintain the
> > > > > GPUManager
> > > > > > > as a
> > > > > > > > > > > > service
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > TaskManagerServices or RuntimeContext. An
> > > alternative is
> > > > > to
> > > > > > > > > create
> > > > > > > > > > /
> > > > > > > > > > > > > > retrieve the GPUManager only in the operators
> that
> > > need
> > > > > it,
> > > > > > > e.g.,
> > > > > > > > > > > with
> > > > > > > > > > > > a
> > > > > > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > > > I agree with you on excluding GPUManager from
> > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - For the first step, where we provide unified
> > > > > TM-level
> > > > > > > GPU
> > > > > > > > > > > > > information
> > > > > > > > > > > > > > to all operators, it should be fine to have
> > > operators
> > > > > > > access /
> > > > > > > > > > > > > > lazy-initiate GPUManager by themselves.
> > > > > > > > > > > > > > - In future, we might have some more
> fine-grained
> > > GPU
> > > > > > > > > > management,
> > > > > > > > > > > > > where
> > > > > > > > > > > > > > we need to maintain GPUManager as a service
> and
> > > put
> > > > > GPU
> > > > > > > info
> > > > > > > > > in
> > > > > > > > > > > slot
> > > > > > > > > > > > > > profiles. But at least for now it's not
> necessary
> > > to
> > > > > > > introduce
> > > > > > > > > > > such
> > > > > > > > > > > > > > complexity.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > However, I have some concerns on excluding
> GPUManager
> > > > > from
> > > > > > > > > > > > RuntimeContext
> > > > > > > > > > > > > > and let operators access it directly.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Configurations needed for creating the
> > > GPUManager is
> > > > > > not
> > > > > > > > > > always
> > > > > > > > > > > > > > available for operators.
> > > > > > > > > > > > > > - If later we want to have fine-grained
> control
> > > over
> > > > > GPU
> > > > > > > > > (e.g.,
> > > > > > > > > > > > > > operators in each slot can only see GPUs
> reserved
> > > for
> > > > > > that
> > > > > > > > > > slot),
> > > > > > > > > > > > the
> > > > > > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I would suggest to wrap the GPUManager behind
> > > > > > RuntimeContext
> > > > > > > and
> > > > > > > > > > only
> > > > > > > > > > > > > > expose the GPUInfo to users. For now, we can
> declare
> > > a
> > > > > > method
> > > > > > > > > > > > > > `getGPUInfo()` in RuntimeContext, with a default
> > > > > definition
> > > > > > > that
> > > > > > > > > > > calls
> > > > > > > > > > > > > > `GPUManager.get()` to get the lazily-created
> > > GPUManager.
> > > > > If
> > > > > > > later
> > > > > > > > > > we
> > > > > > > > > > > > want
> > > > > > > > > > > > > > to create / retrieve GPUManager in a different
> way,
> > > we
> > > > > can
> > > > > > > simply
> > > > > > > > > > > > change
> > > > > > > > > > > > > > how `getGPUInfo` is implemented, without needing
> to
> > > > > change
> > > > > > > any
> > > > > > > > > > public
> > > > > > > > > > > > > > interfaces.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <
> > > > > > > [hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > > Do you mean Minicluster? Yes, it makes sense to
> > > share
> > > > > the
> > > > > > > GPU
> > > > > > > > > > > Manager
> > > > > > > > > > > > > > > in such scenario.
> > > > > > > > > > > > > > > If that's what you worry about, I'm +1 for
> holding
> > > > > > > > > > > > > > > GPUManager(ExternalResourceManagers) in
> > > TaskExecutor
> > > > > > > instead of
> > > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regarding the RuntimeContext/FunctionContext,
> it
> > > just
> > > > > > > holds the
> > > > > > > > > > GPU
> > > > > > > > > > > > > > > info instead of the GPU Manager. AFAIK, it's
> the
> > > only
> > > > > > > place we
> > > > > > > > > > > could
> > > > > > > > > > > > > > > pass GPU info to the
> > > RichFunction/UserDefinedFunction.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried
> <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000
> > > > > > [hidden email]
> > > > > > > > > wrote
> > > > > > > > > > > > ----
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Can we somehow keep this out of the
> > > TaskManager
> > > > > > > services
> > > > > > > > > > > > > > > > > I fear that we could not. IMO, the
> > > GPUManager(or
> > > > > > > > > > > > > > > > > ExternalServicesManagers in future) is
> > > conceptually
> > > > > > > one of
> > > > > > > > > > the
> > > > > > > > > > > > task
> > > > > > > > > > > > > > > > > manager services, just like MemoryManager
> > > before
> > > > > > 1.10.
> > > > > > > > > > > > > > > > > - It maintains/holds the GPU resource at TM
> > > level
> > > > > and
> > > > > > > all
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > operators allocate the GPU resources from
> it.
> > > So,
> > > > > it
> > > > > > > should
> > > > > > > > > > be
> > > > > > > > > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > > > > > > > > - We could add a collection called
> > > > > > > ExternalResourceManagers
> > > > > > > > > > to
> > > > > > > > > > > > hold
> > > > > > > > > > > > > > > > > all managers of other external resources
> in the
> > > > > > future.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Can you help me understand why this needs the addition in
> > > > > > > > > > > > > > > > TaskManagerServices or in the RuntimeContext?
> > > > > > > > > > > > > > > > Are you worried about the case when multiple Task Executors run in the
> > > > > > > > > > > > > > > > same JVM? That's not common, but wouldn't it actually be good in that case
> > > > > > > > > > > > > > > > to share the GPU Manager, given that the GPU is shared?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > > > > > In this FLIP, operators need the
> information.
> > > Thus,
> > > > > > we
> > > > > > > > > expose
> > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > information to the
> > > RuntimeContext/FunctionContext.
> > > > > > The
> > > > > > > slot
> > > > > > > > > > > > profile
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > not aware of GPU resources as GPU is TM
> level
> > > > > > resource
> > > > > > > now.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Can the GPU Manager be a "self contained"
> > > thing
> > > > > > that
> > > > > > > > > simply
> > > > > > > > > > > > takes
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > configuration, and then abstracts
> everything
> > > > > > > internally?
> > > > > > > > > > > > > > > > > Yes, we just pass the path/args of the
> discover
> > > > > > script
> > > > > > > and
> > > > > > > > > > how
> > > > > > > > > > > > many
> > > > > > > > > > > > > > > > > GPUs per TM to it. It takes the
> responsibility
> > > to
> > > > > get
> > > > > > > the
> > > > > > > > > GPU
> > > > > > > > > > > > > > > > > information and expose them to the
> > > > > > > > > > > RuntimeContext/FunctionContext
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > Operators. Meanwhile, we'd better not allow
> > > > > operators
> > > > > > > to
> > > > > > > > > > > directly
> > > > > > > > > > > > > > > > > access GPUManager, it should get what they
> want
> > > > > from
> > > > > > > > > Context.
> > > > > > > > > > > We
> > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > then decouple the interface/implementation
> of
> > > > > > > GPUManager
> > > > > > > > > and
> > > > > > > > > > > > Public
> > > > > > > > > > > > > > > > > API.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan
> Ewen <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > It sounds fine to initially start with
> GPU
> > > > > specific
> > > > > > > > > support
> > > > > > > > > > > and
> > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > generalizing this once we better
> understand
> > > the
> > > > > > > space.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager services? Anything we
> > > > > > > > > > > > > > > > > > have to pull through all layers of the TM makes the TM components yet
> > > > > > > > > > > > > > > > > > more complex and harder to maintain.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > > > > > > -> do the slot profiles need information about the GPU?
> > > > > > > > > > > > > > > > > > -> Can the GPU Manager be a "self contained" thing that simply takes
> > > > > > > > > > > > > > > > > > the configuration, and then abstracts everything internally? Operators
> > > > > > > > > > > > > > > > > > can access it via "GPUManager.get()" or so?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks for all the feedbacks.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right, I'll add them to the
> > > > > > > > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > > Regarding the general extended resource mechanism, I second Xintong's
> > > > > > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > > > > > - It's better to leverage ResourceProfile and ResourceSpec once we
> > > > > > > > > > > > > > > > > > > support fine-grained GPU scheduling. As a first step proposal, I
> > > > > > > > > > > > > > > > > > > prefer to not include it in the scope of this FLIP.
> > > > > > > > > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I understand
> > > > > > > > > > > > > > > > > > > correctly, it is just a code refactoring atm: we could extract the
> > > > > > > > > > > > > > > > > > > open/close/allocateExtendResources of GPUManager to that interface. If
> > > > > > > > > > > > > > > > > > > that is the case, +1 to do it during implementation.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > > As Xintong said, we looked into how Spark supports a general "Custom
> > > > > > > > > > > > > > > > > > > Resource Scheduling" before and decided to introduce a common resource
> > > > > > > > > > > > > > > > > > > configuration schema
> > > > > > > > > > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > > > > > to make it more extensible. I think the "resource" is a proper level
> > > > > > > > > > > > > > > > > > > to contain all the configs of extended resources.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > >
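As an illustration of the configuration schema quoted above (taskmanager.resource.{resourceName}.amount/discovery-script), here is a minimal Python sketch of how per-resource keys could be built and how a flat configuration could be grouped by resource name. It is not Flink code: the helper names resource_keys and parse_resources are hypothetical, and the exact option suffixes are assumptions read off the schema as written in the thread.

```python
from typing import Dict

# Common prefix taken from the schema quoted in the thread.
PREFIX = "taskmanager.resource."


def resource_keys(resource_name: str) -> Dict[str, str]:
    """Build the config keys for one resource type (hypothetical helper)."""
    base = f"{PREFIX}{resource_name}"
    return {
        "amount": f"{base}.amount",
        "discovery-script": f"{base}.discovery-script",
    }


def parse_resources(conf: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Group flat config entries by resource name, ignoring unrelated keys."""
    resources: Dict[str, Dict[str, str]] = {}
    for key, value in conf.items():
        if not key.startswith(PREFIX):
            continue
        # "gpu.amount" -> name "gpu", option "amount"
        name, _, option = key[len(PREFIX):].partition(".")
        if name and option:
            resources.setdefault(name, {})[option] = value
    return resources
```

With this layout, adding a new resource type such as "fpga" only adds keys under the same prefix and needs no new parsing logic, which is the extensibility argument made in the message above.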
> > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource management support will greatly
> > > > > > > > > > > > > > > > > > > > facilitate the development of AI-related applications by PyFlink users.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I think it is better
> > > > > > > > > > > > > > > > > > > > to delete the resource field, which makes them consistent with the names
> > > > > > > > > > > > > > > > > > > > of other resource-related configurations in TaskManagerOption.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]> wrote on Wed, Mar 4, 2020 at 10:39 AM:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline discussion about making
> > > > > > > > > > > > > > > > > > > > > the "GPU Support" as some general "Extended Resource Support". We believe
> > > > > > > > > > > > > > > > > > > > > supporting extended resources in a general mechanism is definitely a good
> > > > > > > > > > > > > > > > > > > > > and extensible way. The reason we propose this FLIP narrowing its scope
> > > > > > > > > > > > > > > > > > > > > down to GPU alone, is mainly for the concern on extra efforts and review
> > > > > > > > > > > > > > > > > > > > > capacity needed for a general mechanism.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > To come up with a good design for a general extended resource management
> > > > > > > > > > > > > > > > > > > > > mechanism, we would need to investigate more on how people use different
> > > > > > > > > > > > > > > > > > > > > kinds of resources in practice. For GPU, we learnt such knowledge from the
> > > > > > > > > > > > > > > > > > > > > experts, Becket and his team members. But for FPGA, or other potential
> > > > > > > > > > > > > > > > > > > > > extended resources, we don't have such convenient information sources, so
> > > > > > > > > > > > > > > > > > > > > the investigation would require more effort, which I tend to think is
> > > > > > > > > > > > > > > > > > > > > not necessary atm.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark supports a general
> > > > > > > > > > > > > > > > > > > > > "Custom Resource Scheduling". Assuming we want to have a similar general
> > > > > > > > > > > > > > > > > > > > > extended resource mechanism in the future, we believe that the current GPU
> > > > > > > > > > > > > > > > > > > > > support design can be easily extended, in an incremental way without too
> > > > > > > > > > > > > > > > > > > > > many reworks.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > - The most important part is probably user interfaces. Spark offers
> > > > > > > > > > > > > > > > > > > > > configuration options to define the amount, discovery script and vendor
> > > > > > > > > > > > > > > > > > > > > (on k8s) on a per resource type basis [1], which is very similar to what
> > > > > > > > > > > > > > > > > > > > > we proposed in this FLIP. I think it's not necessary to expose config
> > > > > > > > > > > > > > > > > > > > > options in the general way atm, since we do not have support for other
> > > > > > > > > > > > > > > > > > > > > resource types now. If later we decide to have per resource type config
> > > > > > > > > > > > > > > > > > > > > options, we can keep backwards compatibility with the currently proposed
> > > > > > > > > > > > > > > > > > > > > options with simple key mapping.
> > > > > > > > > > > > > > > > > > > > > - For the GPU Manager, if later needed we can change it to an "Extended
> > > > > > > > > > > > > > > > > > > > > Resource Manager" (or whatever it is called). That should be a pure
> > > > > > > > > > > > > > > > > > > > > component-internal refactoring.
> > > > > > > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are already fields for
> > > > > > > > > > > > > > > > > > > > > general extended resource. We can of course leverage them when
> > > > > > > > > > > > > > > > > > > > > supporting fine grained GPU scheduling. That is also not in the scope of
> > > > > > > > > > > > > > > > > > > > > this first step proposal, and would require FLIP-56 to be finished first.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a separate FLIP for
> > > > > > > > > > > > > > > > > > > > > the general extended resource mechanism, and keep it in mind when
> > > > > > > > > > > > > > > > > > > > > discussing and implementing the current one.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to generalize the
> > > > > > > > > > > > > > > > > > > > > > resource management to support custom resources. Having that allows users
> > > > > > > > > > > > > > > > > > > > > > to add new resources by themselves. The general resource management may
> > > > > > > > > > > > > > > > > > > > > > involve two different aspects:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported by the extended
> > > > > > > > > > > > > > > > > > > > > > resources in ResourceProfile and ResourceSpec. This will likely cover
> > > > > > > > > > > > > > > > > > > > > > majority of the cases.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to assign the resources
> > > > > > > > > > > > > > > > > > > > > > to different tasks, operators, and so on. This may require two levels /
> > > > > > > > > > > > > > > > > > > > > > steps:
> > > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into suitable slots.
> > > > > > > > > > > > > > > > > > > > > > It is done by the global RM and is not customizable right now.
> > > > > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the operators in TM. e.g.
> > > > > > > > > > > > > > > > > > > > > > GPU 1 for operator A, GPU 2 for operator B. This step is needed assuming
> > > > > > > > > > > > > > > > > > > > > > the global RM does not distinguish individual resources of the same type.
> > > > > > > > > > > > > > > > > > > > > > It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should discover the
> > > > > > > > > > > > > > > > > > > > > > physical GPU information and bind/match them to each operator. Making
> > > > > > > > > > > > > > > > > > > > > > this general will fill in the missing piece to support custom resource
> > > > > > > > > > > > > > > > > > > > > > type definition. But I'd avoid calling it an "External Resource Manager"
> > > > > > > > > > > > > > > > > > > > > > to avoid confusion with RM; maybe something like "Operator Resource
> > > > > > > > > > > > > > > > > > > > > > Assigner" would be more accurate. So for each resource type users can
> > > > > > > > > > > > > > > > > > > > > > have an optional "Operator Resource Assigner" in the TM. For memory,
> > > > > > > > > > > > > > > > > > > > > > users don't need this, but for other extended resources, users may need
> > > > > > > > > > > > > > > > > > > > > > that.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource Assigner" is achievable
> > > > > > > > > > > > > > > > > > > > > > in this FLIP. But I am also OK with having that in a separate FLIP
> > > > > > > > > > > > > > > > > > > > > > because the interface between the "Operator Resource Assigner" and
> > > > > > > > > > > > > > > > > > > > > > operator may take a while to settle down if we want to make it generic.
> > > > > > > > > > > > > > > > > > > > > > But I think our implementation should take this future work into
> > > > > > > > > > > > > > > > > > > > > > consideration so that we don't need to break backwards compatibility once
> > > > > > > > > > > > > > > > > > > > > > we have that.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > >
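The operator-level step (2.b) described above, e.g. GPU 1 for operator A and GPU 2 for operator B, could be sketched roughly as follows. This is an illustrative Python sketch only: the class borrows the "Operator Resource Assigner" name suggested in the message, but its methods (assign, release) and the first-come-first-served policy are assumptions, not part of FLIP-108 or of any Flink API.

```python
from typing import Dict, Hashable, List


class OperatorResourceAssigner:
    """Hypothetical TM-level helper that hands out individual resource IDs."""

    def __init__(self, resource_ids: List[int]) -> None:
        # IDs of the physical resources discovered on this TM (e.g. GPU indices).
        self._free: List[int] = list(resource_ids)
        self._assigned: Dict[Hashable, List[int]] = {}

    def assign(self, operator: Hashable, amount: int = 1) -> List[int]:
        """Bind `amount` free resources to `operator` and return their IDs."""
        if amount > len(self._free):
            raise RuntimeError(
                f"only {len(self._free)} resource(s) left, {amount} requested"
            )
        picked, self._free = self._free[:amount], self._free[amount:]
        self._assigned.setdefault(operator, []).extend(picked)
        return picked

    def release(self, operator: Hashable) -> None:
        """Return everything held by `operator` to the free pool."""
        self._free.extend(self._assigned.pop(operator, []))


assigner = OperatorResourceAssigner([1, 2])  # two GPUs discovered on the TM
gpu_a = assigner.assign("operator A")
gpu_b = assigner.assign("operator B")
```

Memory needs no such component because any byte is as good as another; for GPUs the operator has to be told which index it owns, which is why the bookkeeping lives at the TM level.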
> > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of GPU-aware scheduling
> > > > > > > > > > > > > > > > > > > > > > > and GPU allocation, as I have no experience with that.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is if it makes sense to look
> > > > > > > > > > > > > > > > > > > > > > > at the "GPU Manager" as an "External Resource Manager", and GPU is one
> > > > > > > > > > > > > > > > > > > > > > > such resource.
> > > > > > > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and ResourceSpec, that is how it
> > > > > > > > > > > > > > > > > > > > > > > is done there.
> > > > > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible. Maybe there is a GPU
> > > > > > > > > > > > > > > > > > > > > > > Resource, a specialized NVIDIA GPU Resource, an FPGA Resource, an Alibaba
> > > > > > > > > > > > > > > > > > > > > > > TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP Yangze. GPU resource management support is a
> > > > > > > > > > > > > > > > > > > > > > > > must-have for machine learning use cases. Actually it is one of the
> > > > > > > > > > > > > > > > > > > > > > > > most asked questions from the users who are interested in using Flink
> > > > > > > > > > > > > > > > > > > > > > > > for ML.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Some quick comments / questions to the wiki.
> > > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be mentioned in the public
> > > > > > > > > > > > > > > > > > > > > > > > interface section.
> > > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a public API?
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the discussion, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in Flink is
> > > > > > > > > > > > > > > > > > > > > > > > > significant, especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I think it's a
> > > > > > > > > > > > > > > > > > > > > > > > > very good first step for Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs are in a task executor and
> > > > > > > > > > > > > > > > > > > > > > > > > > forward such requirements to the external resource managers (for
> > > > > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > > > > - Provide information of available GPU resources to operators.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services to discover
> > > > > > > > > > > > > > > > > > > > > > > > > > and expose GPU resource information to the context of functions.
> > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which we provide
> > > > > > > > > > > > > > > > > > > > > > > > > > the privilege mode to help users achieve worker-level isolation in
> > > > > > > > > > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details
> in
> > > the
> > > > > > FLIP
> > > > > > > wiki
> > > > > > > > > > > > > document
> > > > > > > > > > > > > > > [1].
> > > > > > > > > > > > > > > > > > > Looking
> > > > > > > > > > > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > > > > your feedbacks.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
>

Xintong Song

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

I also agree that the pluggable ExternalResourceDriver should be loaded by
the cluster class loader. Although the plugin might be implemented by users,
external resources (as part of task executor resources) should be cluster
configuration, unlike job-level user code such as UDFs, because the task
executors belong to the cluster rather than to jobs.

IIUC, the concern Stephan raised is about the potential credential problem
when executing user code on the RM with the cluster class loader. The concern
makes sense to me, and I think what Yangze suggested is a good approach to
preventing such credential problems. The only reason we tried to execute
user code (i.e. getKubernetes/YarnExternalResource) on the RM was that we
need to set these key-value pairs on pod/container requests. By replacing
the interfaces getKubernetes/YarnExternalResource with the configuration
options 'external-resource.{resourceName}.yarn/kubernetes.key/amount',
we can still fulfill that purpose without the credential risks.
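
To illustrate the configuration-based approach, here is a rough sketch (not
code from the FLIP). The RM derives the key-value pair for a pod/container
request purely from configuration, so no user code has to run on the RM. The
class and method names are hypothetical, a plain Map stands in for the Flink
configuration, and the option names assume the pattern discussed in this
thread ('external-resource.{resourceName}.yarn.key',
'external-resource.{resourceName}.kubernetes.key' and
'external-resource.{resourceName}.amount').

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: resolve an external resource request from configuration only. */
public class ExternalResourceConfigSketch {

    /** Deployment targets discussed in the thread; the infix is part of the option name. */
    enum Deployment {
        YARN("yarn"),
        KUBERNETES("kubernetes");

        final String infix;

        Deployment(String infix) {
            this.infix = infix;
        }
    }

    /** Assumed option name for the deployment-specific resource key. */
    static String keyOption(String resourceName, Deployment deployment) {
        return "external-resource." + resourceName + "." + deployment.infix + ".key";
    }

    /** Assumed option name for the requested amount of the resource. */
    static String amountOption(String resourceName) {
        return "external-resource." + resourceName + ".amount";
    }

    /**
     * Returns the (key, amount) pair to attach to a pod/container request,
     * or empty if the resource is not configured for this deployment.
     */
    static Optional<Map.Entry<String, Long>> resolve(
            Map<String, String> conf, String resourceName, Deployment deployment) {
        String key = conf.get(keyOption(resourceName, deployment));
        String amount = conf.get(amountOption(resourceName));
        if (key == null || amount == null) {
            return Optional.empty();
        }
        Map.Entry<String, Long> entry =
                new AbstractMap.SimpleImmutableEntry<>(key, Long.valueOf(amount));
        return Optional.of(entry);
    }

    public static void main(String[] args) {
        // Example values only; a plain map stands in for flink-conf.yaml.
        Map<String, String> conf = new HashMap<>();
        conf.put("external-resource.gpu.amount", "2");
        conf.put("external-resource.gpu.kubernetes.key", "nvidia.com/gpu");

        // Configured for Kubernetes, not configured for Yarn.
        System.out.println(resolve(conf, "gpu", Deployment.KUBERNETES));
        System.out.println(resolve(conf, "gpu", Deployment.YARN));
    }
}
```

With this shape, a resource that is not supported on a deployment target is
simply left unconfigured for it, and the RM never loads a user class.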

Thank you~

Xintong Song

On Mon, Mar 30, 2020 at 5:17 PM Till Rohrmann <[hidden email]> wrote:

> At the moment the RM does not have a user code class loader and I agree
> with Stephan that it should stay like this. This, however, does not mean
> that we cannot support pluggable components in the RM. As long as the
> plugins are on the system's class path, it should be fine for the RM to
> load them. For example, we could add external resources via Flink's plugin
> mechanism or something similar.
>
> A very simple implementation of such an ExternalResourceDriver could be a
> class which simply returns what is written in the flink-conf.yaml under a
> given key.
>
> Cheers,
> Till
>
> On Mon, Mar 30, 2020 at 5:39 AM Yangze Guo <[hidden email]> wrote:
>
> > Hi, Stephan,
> >
> > I see your concern and I totally agree with you.
> >
> > The interface on the RM side is now `Map<String key, String/Long value>
> > getYarn/KubernetesExternalResource()`. The only valid information the RM
> > gets from it is the configuration key of that external resource in
> > Yarn/K8s. The "String/Long value" would be the same as
> > external-resource.{resourceName}.amount.
> > So, I think it makes sense to replace these two interfaces with two
> > configs, i.e. external-resource.{resourceName}.yarn/kubernetes.key. We
> > may lose some extensibility, but AFAIK it could work with common
> > external resources like GPU and FPGA. WDYT?
> >
> > Best,
> > Yangze Guo
> >
> > On Fri, Mar 27, 2020 at 7:59 PM Stephan Ewen <[hidden email]> wrote:
> > >
> > > Maybe one final comment: It is probably not an issue, but let's try and
> > > keep user code (via user code classloader) out of the ResourceManager,
> > > if possible.
> > >
> > > As background:
> > >
> > > There were thoughts in the past to support setups where the RM must run
> > > with "superuser" credentials, but we cannot run JM/TM with these
> > > credentials, as the user code might access them otherwise.
> > > This is actually possible today, you can run the RM in a different JVM
> > > or in a different container, and give it more credentials than JMs / TMs.
> > > But for this to be feasible, we cannot allow any user-defined code to be
> > > in the JVM, because that instantaneously breaks the isolation of
> > > credentials.
> > >
> > >
> > >
> > > On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:
> > >
> > > > Thanks for the feedback, @Till and @Xintong.
> > > >
> > > > Regarding separating the interface, I'm also +1 with it.
> > > >
> > > > Regarding the resource allocation interface, true, it's dangerous to
> > > > give much access to user codes. Changing the return type to
> > > > Map<String key, String/Long value> makes sense to me. AFAIK, it is
> > > > compatible with all the first-party supported resources for
> > > > Yarn/Kubernetes. It could also free us from the potential dependency
> > > > issue as well.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]> wrote:
> > > > > >
> > > > > > Thanks for updating the FLIP, Yangze.
> > > > > >
> > > > > > I agree with Till that we probably want to separate the K8s/Yarn
> > > > > > decorator calls. Users can still configure one driver class, and we
> > > > > > can use `instanceof` to check whether the driver implemented
> > > > > > K8s/Yarn specific interfaces.
> > > > > >
> > > > > > Moreover, I'm not sure about exposing entire `ContainerRequest` /
> > > > > > `Pod` (`AbstractKubernetesStepDecorator` directly manipulates on
> > > > > > `Pod`) to user codes. It gives more access to user codes than needed
> > > > > > for defining external resource, which might cause problems. Instead,
> > > > > > I would suggest to have interface like `Map<String key, String value>
> > > > > > getYarn/KubernetesExternalResource()` and assemble them into
> > > > > > `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
> > > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'm a bit late to the party. I think the current proposal looks good.
> > > > > > >
> > > > > > > Concerning the ExternalResourceDriver interface defined in the FLIP
> > > > > > > [1], I would suggest to not include the decorator calls for
> > > > > > > Kubernetes and Yarn in the base interface. Instead I would suggest to
> > > > > > > segregate the deployment specific decorator calls into separate
> > > > > > > interfaces. That way an ExternalResourceDriver does not have to
> > > > > > > support all deployments from the very beginning. Moreover, some
> > > > > > > resources might not be supported by a specific deployment target and
> > > > > > > the natural way to express this would be to not implement the
> > > > > > > respective deployment specific interface.
> > > > > > >
> > > > > > > Moreover, having void
> > > > > > > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > > > > > > in the ExternalResourceDriver interface would require Hadoop on
> > > > > > > Flink's classpath whenever the external resource driver is being used.
> > > > > > >
> > > > > > > [1]
> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:
> > > > > >
> > > > > > > Nice, thanks a lot!
> > > > > > >
> > > > > > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > > > > > > >
> > > > > > > > > I've updated the FLIP accordingly. I do not add a
> > > > > > > > > ResourceInfoProvider. Instead, I introduce the
> > > > > > > > > ExternalResourceDriver, which takes the responsibility of all
> > > > > > > > > relevant operations on both RM and TM sides.
> > > > > > > > > After a rethink about decoupling the management of external
> > > > > > > > > resources from TaskExecutor, I think we could do the same thing on
> > > > > > > > > the ResourceManager side. We do not need to add a specific
> > > > > > > > > allocation logic to the ResourceManager each time we add a specific
> > > > > > > > > external resource.
> > > > > > > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > > > > > > containerRequest.
> > > > > > > > > - For Kubernetes, ExternalResourceDriver could provide a decorator
> > > > > > > > > for the TM pod.
> > > > > > > > >
> > > > > > > > > In this way, just like MetricReporter, we allow users to define
> > > > > > > > > their custom ExternalResourceDriver. It is more extensible and fits
> > > > > > > > > the separation of concerns. For more details, please take a look at
> > > > > > > > > [1].
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > This sounds good to go ahead from my side.
> > > > > > > > > >
> > > > > > > > > > I like the approach that Becket suggested - in that case the core
> > > > > > > > > > abstraction that everyone would need to understand would be
> > > > > > > > > > "external resource allocation" and the "ResourceInfoProvider", and
> > > > > > > > > > the GPU specific code would be a specific implementation only known
> > > > > > > > > > to that component that allocates the external resource. That fits
> > > > > > > > > > the separation of concerns well.
> > > > > > > > > >
> > > > > > > > > > I also understand that it should not be over-engineered in the
> > > > > > > > > > first version, so some simplification makes sense, and then
> > > > > > > > > > gradually expand from there.
> > > > > > > > > >
> > > > > > > > > > So +1 to go ahead with what was suggested above (Xintong / Becket)
> > > > > > > > > > from my side.
> > > > > > > > >
> > > > > > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > > > > > >
> > > > > > > > > > > @Stephan
> > > > > > > > > > >
> > > > > > > > > > > I see your concern, and I completely agree with you that we should
> > > > > > > > > > > first think about the "library" / "plugin" / "extension" style if
> > > > > > > > > > > possible.
> > > > > > > > > > >
> > > > > > > > > > > > If GPUs are sliced and assigned during scheduling, there may be
> > > > > > > > > > > > reason, although it looks that it would belong to the slot then.
> > > > > > > > > > > > Is that what we are doing here?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > In the current proposal, we do not have the GPUs sliced and
> > > > > > > > > > > assigned to slots, because it could be problematic without dynamic
> > > > > > > > > > > slot allocation. E.g., the number of GPUs might not be evenly
> > > > > > > > > > > divisible by the number of slots.
> > > > > > > > > > >
> > > > > > > > > > > I think it makes sense to eventually have the GPUs assigned to
> > > > > > > > > > > slots. Even then, we might still need a TM level GPUManager (or
> > > > > > > > > > > ResourceProvider like Becket suggested). For memory, in each slot
> > > > > > > > > > > we can simply request the amount of memory, leaving it to JVM / OS
> > > > > > > > > > > to decide which memory (address) should be assigned. For GPU, and
> > > > > > > > > > > potentially other resources like FPGA, we need to explicitly
> > > > > > > > > > > specify which GPU (index) should be used. Therefore, we need some
> > > > > > > > > > > component at the TM level to coordinate which slot uses which GPU.
> > > > > > > > > > >
> > > > > > > > > > > IMO, unless we say Flink will not support slot-level GPU slicing at
> > > > > > > > > > > least in the foreseeable future, I don't see a good way to avoid
> > > > > > > > > > > touching the TM core. To that end, I think Becket's suggestion
> > > > > > > > > > > points to a good direction, that supports more features (GPU, FPGA,
> > > > > > > > > > > etc.) with less coupling to the TM core (only needs to understand
> > > > > > > > > > > the general interfaces). The detailed implementation for specific
> > > > > > > > > > > resource types can even be encapsulated as a library.
> > > > > > > > > > >
> > > > > > > > > > > @Becket
> > > > > > > > > > >
> > > > > > > > > > > Thanks for sharing your thought on the final state. Despite the
> > > > > > > > > > > details how the interfaces should look like, I think this is a
> > > > > > > > > > > really good abstraction for supporting general resource types.
> > > > > > > > > > >
> > > > > > > > > > > I'd like to further clarify that, the following three things are
> > > > > > > > > > > all that the "Flink core" needs to understand.
> > > > > > > > > > >
> > > > > > > > > > > - The *amount* of resource, for scheduling. Actually, we already
> > > > > > > > > > > have the Resource class in ResourceProfile and ResourceSpec for
> > > > > > > > > > > extended resource. It's just not really used.
> > > > > > > > > > > - The *info*, that Flink provides to the operators / user codes.
> > > > > > > > > > > - The *provider*, which generates the info based on the amount.
> > > > > > > > > > >
> > > > > > > > > > > The "core" does not need to understand the specific implementation
> > > > > > > > > > > details of the above three. They can even be implemented in a
> > > > > > > > > > > 3rd-party library. Similar to how we allow users to define their
> > > > > > > > > > > custom MetricReporter.
> > > > > > > > > >
> > > > > > > > > > Thank you~
> > > > > > > > > >
> > > > > > > > > > Xintong Song
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for the comment, Stephan.
> > > > > > > > > > > >
> > > > > > > > > > > > > - If everything becomes a "core feature", it will make the
> > > > > > > > > > > > > project hard to develop in the future. Thinking "library" /
> > > > > > > > > > > > > "plugin" / "extension" style where possible helps.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Completely agree. It is much more important to design a mechanism
> > > > > > > > > > > > than focusing on a specific case. Here is what I am thinking to
> > > > > > > > > > > > fully support custom resource management:
> > > > > > > > > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to
> > > > > > > > > > > > define the resource and the amount required. They will be used to
> > > > > > > > > > > > find suitable TMs slots to run the tasks. At this point, the
> > > > > > > > > > > > resources are only measured by amount, i.e. they do not have
> > > > > > > > > > > > individual ID.
> > > > > > > > > > > >
> > > > > > > > > > > > 2. On the TM side, have something like *"ResourceInfoProvider"*
> > > > > > > > > > > > to identify and provide the detailed information of the
> > > > > > > > > > > > individual resource, e.g. GPU ID. It is important because the
> > > > > > > > > > > > operator may have to explicitly interact with the physical
> > > > > > > > > > > > resource it uses. The ResourceInfoProvider might look like
> > > > > > > > > > > > something below.
> > > > > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId,
> > > > > > > > > > > >         ResourceProfile resourceProfile);
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > - There could be several "*ResourceInfoProvider*" configured on
> > > > > > > > > > > > the TM to retrieve the information for different resources.
> > > > > > > > > > > > - The TM will be responsible to assign those individual resources
> > > > > > > > > > > > to each operator according to their requested amount.
> > > > > > > > > > > > - The operators will be able to get the ResourceInfo from their
> > > > > > > > > > > > RuntimeContext.
> > > > > > > > > > > >
> > > > > > > > > > > > If we agree this is a reasonable final state, we can adapt the
> > > > > > > > > > > > current FLIP to it. In fact it does not sound a big change to me.
> > > > > > > > > > > > All the proposed configuration can be as is, it is just that
> > > > > > > > > > > > Flink itself won't care about them, instead a GPUInfoProvider
> > > > > > > > > > > > implementing the ResourceInfoProvider will use them.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > >
> > > > > > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi all!
> > > > > > > > > > > > >
> > > > > > > > > > > > > The main point I wanted to throw into the discussion is the
> > > > > > > > > > > > > following:
> > > > > > > > > > > > > - With more and more use cases, more and more tools go into
> > > > > > > > > > > > > Flink
> > > > > > > > > > > > > - If everything becomes a "core feature", it will make the
> > > > > > > > > > > > > project hard to develop in the future. Thinking "library" /
> > > > > > > > > > > > > "plugin" / "extension" style where possible helps.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - A good thought experiment is always: How many future
> > > > > > > > > > > > > developers have to interact with this code (and possibly
> > > > > > > > > > > > > understand it partially), even if the features they touch have
> > > > > > > > > > > > > nothing to do with GPU support. If many contributors to
> > > > > > > > > > > > > unrelated features will have to touch it and understand it,
> > > > > > > > > > > > > then let's think if there is a different solution. Maybe there
> > > > > > > > > > > > > is not, but then we should be sure why.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - That led me to raising this issue: If the GPU manager becomes
> > > > > > > > > > > > > a core service in the TaskManager, Environment, RuntimeContext,
> > > > > > > > > > > > > etc. then everyone developing TM and streaming tasks need to
> > > > > > > > > > > > > understand the GPU manager. That seems oddly specific, is my
> > > > > > > > > > > > > impression.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Access to configuration seems not the right reason to do that.
> > > > > > > > > > > > > We should expose the Flink configuration from the
> > > > > > > > > > > > > RuntimeContext anyways.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If GPUs are sliced and assigned during scheduling, there may be
> > > > > > > > > > > > > reason, although it looks that it would belong to the slot
> > > > > > > > > > > > > then. Is that what we are doing here?
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Stephan
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > IMO, eventually an operator should only see info of GPUs that
> > > > > > > > > > > > > > are dedicated for it, instead of all GPUs on the
> > > > > > > > > > > > > > machine/container in the current design. It does not make
> > > > > > > > > > > > > > sense to let the user who writes a UDF to worry about
> > > > > > > > > > > > > > coordination among multiple operators running on the same
> > > > > > > > > > > > > > machine. And if we want to limit the GPU info an operator
> > > > > > > > > > > > > > sees, we should not let the operator to instantiate
> > > > > > > > > > > > > > GPUManager, which means we have to expose something through
> > > > > > > > > > > > > > runtime context, either GPU info or some kind of limited
> > > > > > > > > > > > > > access to the GPUManager.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > >
> > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It probably makes sense for us to first agree on the final
> > > > > > > > > > > > > > > state. More specifically, will the resource info be exposed
> > > > > > > > > > > > > > > through runtime context eventually?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If that is the final state and we have a seamless migration
> > > > > > > > > > > > > > > story from this FLIP to that final state, personally I
> > > > > > > > > > > > > > > think it is OK to expose the GPU info in the runtime
> > > > > > > > > > > > > > > context.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > > > > > I think what Stephan means (@Stephan, please correct me if
> > > > > > > > > > > > > > > > I'm wrong) is that, we might not need to hold and maintain
> > > > > > > > > > > > > > > > the GPUManager as a service in TaskManagerServices or
> > > > > > > > > > > > > > > > RuntimeContext. An alternative is to create / retrieve the
> > > > > > > > > > > > > > > > GPUManager only in the operators that need it, e.g., with
> > > > > > > > > > > > > > > > a static method `GPUManager.get()`.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > > > > > I agree with you on excluding GPUManager from
> > > > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - For the first step, where we provide unified TM-level
> > > > > > > > > > > > > > > > GPU information to all operators, it should be fine to
> > > > > > > > > > > > > > > > have operators access / lazy-initiate GPUManager by
> > > > > > > > > > > > > > > > themselves.
> > > > > > > > > > > > > > > > - In future, we might have some more fine-grained GPU
> > > > > > > > > > > > > > > > management, where we need to maintain GPUManager as a
> > > > > > > > > > > > > > > > service and put GPU info in slot profiles. But at least
> > > > > > > > > > > > > > > > for now it's not necessary to introduce such complexity.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > However, I have some concerns on excluding GPUManager from
> > > > > > > > > > > > > > > > RuntimeContext and let operators access it directly.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - Configurations needed for creating the GPUManager is not
> > > > > > > > > > > > > > > > always available for operators.
> > > > > > > > > > > > > > > > - If later we want to have fine-grained control over GPU
> > > > > > > > > > > > > > > > (e.g., operators in each slot can only see GPUs reserved
> > > > > > > > > > > > > > > > for that slot), the approach cannot be easily extended.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I would suggest to wrap the GPUManager behind
> > > > > > > > > > > > > > > > RuntimeContext and only expose the GPUInfo to users. For
> > > > > > > > > > > > > > > > now, we can declare a method `getGPUInfo()` in
> > > > > > > > > > > > > > > > RuntimeContext, with a default definition that calls
> > > > > > > > > > > > > > > > `GPUManager.get()` to get the lazily-created GPUManager.
> > > > > > > > > > > > > > > > If later we want to create / retrieve GPUManager in a
> > > > > > > > > > > > > > > > different way, we can simply change how `getGPUInfo` is
> > > > > > > > > > > > > > > > implemented, without needing to change any public
> > > > > > > > > > > > > > > > interfaces.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > > > > Do you mean Minicluster? Yes, it makes sense to share the
> > > > > > > > > > > > > > > > > GPU Manager in such scenario.
> > > > > > > > > > > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > > > > > > > > > > GPUManager(ExternalResourceManagers) in TaskExecutor
> > > > > > > > > > > > > > > > > instead of TaskManagerServices.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Regarding the RuntimeContext/FunctionContext, it just
> > > > > > > > > > > > > > > > > holds the GPU info instead of the GPU Manager. AFAIK,
> > > > > > > > > > > > > > > > > it's the only place we could pass GPU info to the
> > > > > > > > > > > > > > > > > RichFunction/UserDefinedFunction.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000 [hidden email] wrote ----
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Can we somehow keep this out of the TaskManager
> > > > > > > > > > > > > > > > > > > > services
> > > > > > > > > > > > > > > > > > > I fear that we could not. IMO, the GPUManager(or
> > > > > > > > > > > > > > > > > > > ExternalServicesManagers in future) is conceptually
> > > > > > > > > > > > > > > > > > > one of the task manager services, just like
> > > > > > > > > > > > > > > > > > > MemoryManager before 1.10.
> > > > > > > > > > > > > > > > > > > - It maintains/holds the GPU resource at TM level
> > > > > > > > > > > > > > > > > > > and all of the operators allocate the GPU resources
> > > > > > > > > > > > > > > > > > > from it. So, it should be exclusive to a single
> > > > > > > > > > > > > > > > > > > TaskExecutor.
> > > > > > > > > > > > > > > > > > > - We could add a collection called
> > > > > > > > > > > > > > > > > > > ExternalResourceManagers to hold all managers of
> > > > > > > > > > > > > > > > > > > other external resources in the future.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Can you help me understand why this needs
> the
> > > > > > addition
> > > > > > > in
> > > > > > > > > > > > > > > > TaskManagerServices
> > > > > > > > > > > > > > > > > or in the RuntimeContext?
> > > > > > > > > > > > > > > > > Are you worried about the case when
> multiple
> > Task
> > > > > > > > Executors
> > > > > > > > > > run
> > > > > > > > > > > > in
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > > > JVM? That's not common, but wouldn't it
> > actually
> > > > be
> > > > > > > good
> > > > > > > > in
> > > > > > > > > > > that
> > > > > > > > > > > > > case
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > share the GPU Manager, given that the GPU
> is
> > > > shared?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > > > > > > In this FLIP, operators need the
> > information.
> > > > Thus,
> > > > > > > we
> > > > > > > > > > expose
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > information to the
> > > > RuntimeContext/FunctionContext.
> > > > > > > The
> > > > > > > > slot
> > > > > > > > > > > > > profile
> > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > not aware of GPU resources as GPU is a TM
> > level
> > > > > > > resource
> > > > > > > > now.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Can the GPU Manager be a "self
> contained"
> > > > thing
> > > > > > > that
> > > > > > > > > > simply
> > > > > > > > > > > > > takes
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > configuration, and then abstracts
> > everything
> > > > > > > > internally?
> > > > > > > > > > > > > > > > > > Yes, we just pass the path/args of the
> > discovery
> > > > > > > script
> > > > > > > > and
> > > > > > > > > > > how
> > > > > > > > > > > > > many
> > > > > > > > > > > > > > > > > > GPUs per TM to it. It takes the
> > responsibility
> > > > to
> > > > > > get
> > > > > > > > the
> > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > information and expose them to the
> > > > > > > > > > > > RuntimeContext/FunctionContext
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > Operators. Meanwhile, we'd better not
> allow
> > > > > > operators
> > > > > > > > to
> > > > > > > > > > > > directly
> > > > > > > > > > > > > > > > > > > access GPUManager; they should get what
> they
> > want
> > > > > > from
> > > > > > > > > > Context.
> > > > > > > > > > > > We
> > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > then decouple the
> interface/implementation
> > of
> > > > > > > > GPUManager
> > > > > > > > > > and
> > > > > > > > > > > > > Public
> > > > > > > > > > > > > > > > > > API.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan
> > Ewen <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > It sounds fine to initially start with
> > GPU
> > > > > > specific
> > > > > > > > > > support
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > > generalizing this once we better
> > understand
> > > > the
> > > > > > > > space.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > About the implementation suggested in
> > > > FLIP-108:
> > > > > > > > > > > > > > > > > > > - Can we somehow keep this out of the
> > > > TaskManager
> > > > > > > > > > services?
> > > > > > > > > > > > > > > Anything
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > have to pull through all layers of the
> TM
> > > > makes
> > > > > > the
> > > > > > > > TM
> > > > > > > > > > > > > components
> > > > > > > > > > > > > > > yet
> > > > > > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > > complex and harder to maintain.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > - What parts need information about
> this?
> > > > > > > > > > > > > > > > > > > -> do the slot profiles need
> information
> > > > about
> > > > > > the
> > > > > > > > GPU?
> > > > > > > > > > > > > > > > > > > -> Can the GPU Manager be a "self
> > contained"
> > > > > > thing
> > > > > > > > that
> > > > > > > > > > > > simply
> > > > > > > > > > > > > > > takes
> > > > > > > > > > > > > > > > > > > the configuration, and then abstracts
> > > > everything
> > > > > > > > > > > internally?
> > > > > > > > > > > > > > > > Operators
> > > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > access it via "GPUManager.get()" or so?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze
> > Guo <
> > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo,
> you're
> > > > right,
> > > > > > > > I'll add
> > > > > > > > > > > > them
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > > > Regarding the general extended
> resource
> > > > > > > mechanism,
> > > > > > > > I
> > > > > > > > > > > second
> > > > > > > > > > > > > > > > Xintong's
> > > > > > > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > > > > > > - It's better to leverage
> > ResourceProfile
> > > > and
> > > > > > > > > > > ResourceSpec
> > > > > > > > > > > > > > after
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > > support fine-grained GPU
> > scheduling. As
> > > > a
> > > > > > > first
> > > > > > > > step
> > > > > > > > > > > > > > > proposal, I
> > > > > > > > > > > > > > > > > > > > prefer to not include it in the scope
> > of
> > > > this
> > > > > > > FLIP.
> > > > > > > > > > > > > > > > > > > > - Regarding the "Extended Resource
> > > > Manager",
> > > > > > if I
> > > > > > > > > > > > understand
> > > > > > > > > > > > > > > > > > > > > correctly, it is just a code refactoring
> > atm,
> > > > we
> > > > > > > could
> > > > > > > > > > > extract
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > open/close/allocateExtendResources of
> > > > > > GPUManager
> > > > > > > to
> > > > > > > > > > that
> > > > > > > > > > > > > > > > interface. If
> > > > > > > > > > > > > > > > > > > > that is the case, +1 to do it during
> > > > > > > > implementation.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > > > As Xintong said, we looked into how
> > Spark
> > > > > > > supports
> > > > > > > > a
> > > > > > > > > > > > general
> > > > > > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > > > > > > Resource Scheduling" before and
> > decided to
> > > > > > > > introduce a
> > > > > > > > > > > > common
> > > > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > > > configuration
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > >
> > > > schema(taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > > > > > > to make it more extensible. I think
> the
> > > > > > > "resource"
> > > > > > > > is a
> > > > > > > > > > > > > proper
> > > > > > > > > > > > > > > > level
> > > > > > > > > > > > > > > > > > > > to contain all the configs of
> extended
> > > > > > resources.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM
> Xingbo
> > > > Huang <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource
> > > > > > management
> > > > > > > > > > support
> > > > > > > > > > > > will
> > > > > > > > > > > > > > > > greatly
> > > > > > > > > > > > > > > > > > > > > facilitate the development of
> > AI-related
> > > > > > > > applications
> > > > > > > > > > > by
> > > > > > > > > > > > > > > PyFlink
> > > > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I have only one comment about this
> > wiki:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Regarding the names of several GPU
> > > > > > > > configurations, I
> > > > > > > > > > > > think
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > better
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > delete the resource field to make it
> > > > consistent
> > > > > > > > with
> > > > > > > > > > the
> > > > > > > > > > > > > names
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > > > > resource-related configurations in
> > > > > > > > > TaskManagerOptions.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > e.g.
> > > > > > > > taskmanager.resource.gpu.discovery-script.path
> > > > > > > > > > ->
> > > > > > > > > > > > > > > > > > > > >
> taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Xintong Song <
> [hidden email]>
> > > > > > > > > wrote on Wed, Mar 4, 2020 at
> > > > > > > > > > > > > > 10:39 AM:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also
> > had
> > > > an
> > > > > > > > offline
> > > > > > > > > > > > > discussion
> > > > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > > > making
> > > > > > > > > > > > > > > > > > > > > > the "GPU Support" as some general
> > > > "Extended
> > > > > > > > > > Resource
> > > > > > > > > > > > > > > Support".
> > > > > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > > > > believe
> > > > > > > > > > > > > > > > > > > > > > supporting extended resources in
> a
> > > > general
> > > > > > > > > > mechanism
> > > > > > > > > > > is
> > > > > > > > > > > > > > > > definitely
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > good
> > > > > > > > > > > > > > > > > > > > > > and extensible way. The reason we
> > > > propose
> > > > > > > this
> > > > > > > > FLIP
> > > > > > > > > > > > > > narrowing
> > > > > > > > > > > > > > > > its
> > > > > > > > > > > > > > > > > > scope
> > > > > > > > > > > > > > > > > > > > > > down to GPU alone, is mainly for
> > the
> > > > > > concern
> > > > > > > on
> > > > > > > > > > extra
> > > > > > > > > > > > > > efforts
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > review
> > > > > > > > > > > > > > > > > > > > > > capacity needed for a general
> > > > mechanism.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > To come up with a good design for
> a
> > > > general
> > > > > > > > extended
> > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > management
> > > > > > > > > > > > > > > > > > > > > > mechanism, we would need to
> > investigate
> > > > > > more
> > > > > > > > on how
> > > > > > > > > > > > > people
> > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > different
> > > > > > > > > > > > > > > > > > > > > > > kinds of resources in practice.
> For
> > > > GPU, we
> > > > > > > > learnt
> > > > > > > > > > > such
> > > > > > > > > > > > > > > > knowledge
> > > > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > experts, Becket and his team
> > members.
> > > > But
> > > > > > for
> > > > > > > > FPGA,
> > > > > > > > > > > or
> > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > potential
> > > > > > > > > > > > > > > > > > > > > > extended resources, we don't have
> > such
> > > > > > > > convenient
> > > > > > > > > > > > > > information
> > > > > > > > > > > > > > > > > > sources,
> > > > > > > > > > > > > > > > > > > > > > > making the investigation require
> > more
> > > > > > > efforts,
> > > > > > > > > > > which I
> > > > > > > > > > > > > > tend
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > > > not necessary atm.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On the other hand, we also looked
> > into
> > > > how
> > > > > > > > Spark
> > > > > > > > > > > > > supports a
> > > > > > > > > > > > > > > > general
> > > > > > > > > > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > > > > > > > > Resource Scheduling". Assuming we
> > want
> > > > to
> > > > > > > have
> > > > > > > > a
> > > > > > > > > > > > similar
> > > > > > > > > > > > > > > > general
> > > > > > > > > > > > > > > > > > > > extended
> > > > > > > > > > > > > > > > > > > > > > resource mechanism in the future,
> > we
> > > > > > believe
> > > > > > > > that
> > > > > > > > > > the
> > > > > > > > > > > > > > current
> > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > > > > > design can be easily extended, in
> > an
> > > > > > > > incremental
> > > > > > > > > > way
> > > > > > > > > > > > > > without
> > > > > > > > > > > > > > > > too
> > > > > > > > > > > > > > > > > > many
> > > > > > > > > > > > > > > > > > > > > > reworks.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > - The most important part is
> > probably
> > > > user
> > > > > > > > > > > interfaces.
> > > > > > > > > > > > > > Spark
> > > > > > > > > > > > > > > > > > offers
> > > > > > > > > > > > > > > > > > > > > > configuration options to define
> the
> > > > amount,
> > > > > > > > > > discovery
> > > > > > > > > > > > > > script
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > vendor
> > > > > > > > > > > > > > > > > > > > > > (on
> > > > > > > > > > > > > > > > > > > > > > > k8s) on a per resource type basis
> > [1],
> > > > which
> > > > > > > is
> > > > > > > > very
> > > > > > > > > > > > > similar
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > what
> > > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > > > proposed in this FLIP. I think
> > it's not
> > > > > > > > necessary
> > > > > > > > > > to
> > > > > > > > > > > > > expose
> > > > > > > > > > > > > > > > > > config
> > > > > > > > > > > > > > > > > > > > > > options
> > > > > > > > > > > > > > > > > > > > > > in the general way atm, since we
> > do not
> > > > > > have
> > > > > > > > > > > support
> > > > > > > > > > > > for
> > > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > > > > > > types now. If later we decide to
> > have
> > > > per
> > > > > > > > resource
> > > > > > > > > > > > type
> > > > > > > > > > > > > > > config
> > > > > > > > > > > > > > > > > > > > > > options, we
> > > > > > > > > > > > > > > > > > > > > > can have backwards compatibility
> > on the
> > > > > > > current
> > > > > > > > > > > > proposed
> > > > > > > > > > > > > > > > options
> > > > > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > > > > simple key mapping.
> > > > > > > > > > > > > > > > > > > > > > - For the GPU Manager, if later
> > needed
> > > > we
> > > > > > can
> > > > > > > > > > change
> > > > > > > > > > > it
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > "Extended
> > > > > > > > > > > > > > > > > > > > > > Resource Manager" (or whatever it
> > is
> > > > > > called).
> > > > > > > > That
> > > > > > > > > > > > should
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > pure
> > > > > > > > > > > > > > > > > > > > > > component-internal refactoring.
> > > > > > > > > > > > > > > > > > > > > > - For ResourceProfile and
> > ResourceSpec,
> > > > > > there
> > > > > > > > are
> > > > > > > > > > > > already
> > > > > > > > > > > > > > > > > > fields for
> > > > > > > > > > > > > > > > > > > > > > general extended resource. We can
> > of
> > > > course
> > > > > > > > > > leverage
> > > > > > > > > > > > them
> > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > > > > > supporting
> > > > > > > > > > > > > > > > > > > > > > fine grained GPU scheduling. That
> > is
> > > > also
> > > > > > not
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > > scope
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > first
> > > > > > > > > > > > > > > > > > > > > > step proposal, and would require
> > > > FLIP-56 to
> > > > > > > be
> > > > > > > > > > > finished
> > > > > > > > > > > > > > > first.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > To sum up, I agree with
> Becket
> > that
> > > > > > > we should have
> > > > > > > a
> > > > > > > > > > > separate
> > > > > > > > > > > > > > FLIP
> > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > general extended resource
> > mechanism,
> > > > and
> > > > > > keep
> > > > > > > > it in
> > > > > > > > > > > > mind
> > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > > > discussing
> > > > > > > > > > > > > > > > > > > > > > and implementing the current one.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> >
> https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM
> > Becket
> > > > Qin <
> > > > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan.
> It
> > > > makes
> > > > > > > total
> > > > > > > > > > sense
> > > > > > > > > > > to
> > > > > > > > > > > > > > > > generalize
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > > resource management to support
> > custom
> > > > > > > > resources.
> > > > > > > > > > > > Having
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > allows
> > > > > > > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > > > > > > > to add new resources by
> > themselves.
> > > > The
> > > > > > > > general
> > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > management
> > > > > > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > > > > > involve two different aspects:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > 1. The custom resource type
> > > > definition.
> > > > > > It
> > > > > > > is
> > > > > > > > > > > > supported
> > > > > > > > > > > > > > by
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > extended
> > > > > > > > > > > > > > > > > > > > > > > resources in ResourceProfile
> and
> > > > > > > > ResourceSpec.
> > > > > > > > > > This
> > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > likely
> > > > > > > > > > > > > > > > > > cover
> > > > > > > > > > > > > > > > > > > > > > > majority of the cases.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > 2. The custom resource
> allocation
> > > > logic,
> > > > > > > > i.e. how
> > > > > > > > > > > to
> > > > > > > > > > > > > > assign
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > resources
> > > > > > > > > > > > > > > > > > > > > > > to different tasks, operators,
> > and
> > > > so on.
> > > > > > > > This
> > > > > > > > > > may
> > > > > > > > > > > > > > require
> > > > > > > > > > > > > > > > two
> > > > > > > > > > > > > > > > > > > > levels /
> > > > > > > > > > > > > > > > > > > > > > > steps:
> > > > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure
> the
> > > > subtasks
> > > > > > > > are put
> > > > > > > > > > > > into
> > > > > > > > > > > > > > > > > > suitable
> > > > > > > > > > > > > > > > > > > > > > slots.
> > > > > > > > > > > > > > > > > > > > > > > It is done by the global RM and
> > is
> > > > not
> > > > > > > > > > customizable
> > > > > > > > > > > > > right
> > > > > > > > > > > > > > > > now.
> > > > > > > > > > > > > > > > > > > > > > > b. Operator level - map the
> exact
> > > > > > resource
> > > > > > > > to the
> > > > > > > > > > > > > > operators
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > TM.
> > > > > > > > > > > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > > > > > > > > > > GPU 1 for operator A, GPU 2 for
> > > > operator
> > > > > > B.
> > > > > > > > This
> > > > > > > > > > > step
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > needed
> > > > > > > > > > > > > > > > > > > > assuming
> > > > > > > > > > > > > > > > > > > > > > > the global RM does not
> > distinguish
> > > > > > > individual
> > > > > > > > > > > > resources
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > > > > > > type.
> > > > > > > > > > > > > > > > > > > > > > > It is true for memory, but not
> > for
> > > > GPU.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to
> > do 2.b
> > > > > > here.
> > > > > > > > So it
> > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > > discover the
> > > > > > > > > > > > > > > > > > > > > > > physical GPU information and
> > > > bind/match
> > > > > > > them
> > > > > > > > to
> > > > > > > > > > > each
> > > > > > > > > > > > > > > > operator.
> > > > > > > > > > > > > > > > > > > > Making
> > > > > > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > > > > general will fill in the
> missing
> > > > piece to
> > > > > > > > support
> > > > > > > > > > > > > custom
> > > > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > > > type
> > > > > > > > > > > > > > > > > > > > > > > definition. But I'd avoid
> > calling it
> > > > > an
> > > > > > > > "External
> > > > > > > > > > > > > Resource
> > > > > > > > > > > > > > > > > > Manager" to
> > > > > > > > > > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > > > > > > > > > > confusion with RM, maybe
> > something
> > > > like
> > > > > > > > "Operator
> > > > > > > > > > > > > > Resource
> > > > > > > > > > > > > > > > > > Assigner"
> > > > > > > > > > > > > > > > > > > > > > would
> > > > > > > > > > > > > > > > > > > > > > > be more accurate. So for each
> > > > resource
> > > > > > type
> > > > > > > > users
> > > > > > > > > > > can
> > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > optional
> > > > > > > > > > > > > > > > > > > > > > > "Operator Resource Assigner" in
> > the
> > > > TM.
> > > > > > For
> > > > > > > > > > memory,
> > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > don't
> > > > > > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > > > > > > > this,
> > > > > > > > > > > > > > > > > > > > > > > but for other extended
> resources,
> > > > users
> > > > > > may
> > > > > > > > need
> > > > > > > > > > > > that.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable
> > > > "Operator
> > > > > > > > Resource
> > > > > > > > > > > > > > Assigner"
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > achievable
> > > > > > > > > > > > > > > > > > > > > > > in this FLIP. But I am also OK
> > with
> > > > > > having
> > > > > > > > that
> > > > > > > > > > in
> > > > > > > > > > > a
> > > > > > > > > > > > > > > separate
> > > > > > > > > > > > > > > > > > FLIP
> > > > > > > > > > > > > > > > > > > > > > because
> > > > > > > > > > > > > > > > > > > > > > > the interface between the
> > "Operator
> > > > > > > Resource
> > > > > > > > > > > > Assigner"
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > operator
> > > > > > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > > > > > take a while to settle down if
> we
> > > > want to
> > > > > > > > make it
> > > > > > > > > > > > > > generic.
> > > > > > > > > > > > > > > > But I
> > > > > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > > > > > > > > implementation should take this
> > > > future
> > > > > > work
> > > > > > > > into
> > > > > > > > > > > > > > > > consideration so
> > > > > > > > > > > > > > > > > > > > that we
> > > > > > > > > > > > > > > > > > > > > > > don't need to break backwards
> > > > > > compatibility
> > > > > > > > once
> > > > > > > > > > we
> > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > that.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM
> > > > Stephan
> > > > > > > Ewen
> > > > > > > > <
> > > > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thank you for writing this
> > FLIP.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > I cannot really give much
> input
> > > > into
> > > > > > the
> > > > > > > > > > > mechanics
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > GPU-aware
> > > > > > > > > > > > > > > > > > > > > > > scheduling
> > > > > > > > > > > > > > > > > > > > > > > > and GPU allocation, as I have
> > no
> > > > > > > experience
> > > > > > > > > > with
> > > > > > > > > > > > > that.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > One thought I had when
> reading
> > the
> > > > > > > > proposal is
> > > > > > > > > > if
> > > > > > > > > > > > it
> > > > > > > > > > > > > > > makes
> > > > > > > > > > > > > > > > > > sense to
> > > > > > > > > > > > > > > > > > > > > > look
> > > > > > > > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > > > > > > the "GPU Manager" as an
> > "External
> > > > > > > Resource
> > > > > > > > > > > > Manager",
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > is one
> > > > > > > > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > > > > > > > resource.
> > > > > > > > > > > > > > > > > > > > > > > > The way I understand the
> > > > > > ResourceProfile
> > > > > > > > and
> > > > > > > > > > > > > > > ResourceSpec,
> > > > > > > > > > > > > > > > > > that is
> > > > > > > > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > > > > > > is done there.
> > > > > > > > > > > > > > > > > > > > > > > > It has the advantage that it
> > looks
> > > > more
> > > > > > > > > > > extensible.
> > > > > > > > > > > > > > Maybe
> > > > > > > > > > > > > > > > > > there is
> > > > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > > > > > Resource, a specialized
> NVIDIA
> > GPU
> > > > > > > > Resource,
> > > > > > > > > > an
> > > > > > > > > > > > > FPGA
> > > > > > > > > > > > > > > > > > > Resource, an
> > > > > > > > > > > > > > > > > > > > > > Alibaba
> > > > > > > > > > > > > > > > > > > > > > > > TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57
> AM
> > > > Becket
> > > > > > > Qin <
> > > > > > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP Yangze.
> > GPU
> > > > > > > resource
> > > > > > > > > > > > management
> > > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > is a
> > > > > > > > > > > > > > > > > > > > > > > > must-have
> > > > > > > > > > > > > > > > > > > > > > > > > for machine learning use
> > cases.
> > > > > > > Actually
> > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > > > one
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > most
> > > > > > > > > > > > > > > > > > > > > > > asked
> > > > > > > > > > > > > > > > > > > > > > > > > > questions from the users who
> > are
> > > > > > > > interested in
> > > > > > > > > > > > using
> > > > > > > > > > > > > > > Flink
> > > > > > > > > > > > > > > > > > for ML.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Some quick comments /
> > questions
> > > > to
> > > > > > the
> > > > > > > > wiki.
> > > > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API
> > should
> > > > > > probably
> > > > > > > > also
> > > > > > > > > > be
> > > > > > > > > > > > > > > > mentioned in
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > public
> > > > > > > > > > > > > > > > > > > > > > > > > interface section.
> > > > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure
> that
> > > > holds
> > > > > > GPU
> > > > > > > > info
> > > > > > > > > > > > also a
> > > > > > > > > > > > > > > > public
> > > > > > > > > > > > > > > > > > API?
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at
> 10:15
> > AM
> > > > > > Xintong
> > > > > > > > Song
> > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the
> > FLIP
> > > > and
> > > > > > > > kicking
> > > > > > > > > > off
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > discussion,
> > > > > > > > > > > > > > > > > > > > > > Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature.
> > > > Supporting
> > > > > > > > using
> > > > > > > > > > of
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > Flink
> > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > > > > > significant,
> > > > > > > > > > > > > > > > > > > > > > > > > > especially for the ML
> > > > scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP
> wiki
> > > > doc and
> > > > > > > it
> > > > > > > > > > looks
> > > > > > > > > > > > good
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > me. I
> > > > > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > > > > > it's a
> > > > > > > > > > > > > > > > > > > > > > > > > > very good first step for
> > > > Flink's
> > > > > > GPU
> > > > > > > > > > > supports.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at
> > 12:06 PM
> > > > > > > Yangze
> > > > > > > > Guo
> > > > > > > > > > <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start
> a
> > > > > > discussion
> > > > > > > > > > thread
> > > > > > > > > > > on
> > > > > > > > > > > > > > > > "FLIP-108:
> > > > > > > > > > > > > > > > > > Add
> > > > > > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly
> > discusses
> > > > the
> > > > > > > > following
> > > > > > > > > > > > > issues:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > - Enable user to
> > configure
> > > > how
> > > > > > many
> > > > > > > > GPUs
> > > > > > > > > > > in a
> > > > > > > > > > > > > > task
> > > > > > > > > > > > > > > > > > executor
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > > > > > > > forward such
> > requirements to
> > > > the
> > > > > > > > external
> > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > managers
> > > > > > > > > > > > > > > > > > > > (for
> > > > > > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos
> > > > setups).
> > > > > > > > > > > > > > > > > > > > > > > > > > > - Provide information
> of
> > > > > > available
> > > > > > > > GPU
> > > > > > > > > > > > > resources
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > operators.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in
> > the
> > > > FLIP
> > > > > > > are
> > > > > > > > as
> > > > > > > > > > > > > follows:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource
> > > > > > requirements
> > > > > > > > to
> > > > > > > > > > > > > > > > Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager
> as
> > > > one of
> > > > > > > the
> > > > > > > > task
> > > > > > > > > > > > > manager
> > > > > > > > > > > > > > > > > > services to
> > > > > > > > > > > > > > > > > > > > > > > > discover
> > > > > > > > > > > > > > > > > > > > > > > > > > > and expose GPU resource
> > > > > > information
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > > > context
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default
> > > > script
> > > > > > for
> > > > > > > > GPU
> > > > > > > > > > > > > discovery,
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > which we
> > > > > > > > > > > > > > > > > > > > > > > provide
> > > > > > > > > > > > > > > > > > > > > > > > > > > the privilege mode to
> > help
> > > > user
> > > > > > to
> > > > > > > > > > achieve
> > > > > > > > > > > > > > > > worker-level
> > > > > > > > > > > > > > > > > > > > isolation
> > > > > > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more
> details
> > in
> > > > the
> > > > > > > FLIP
> > > > > > > > wiki
> > > > > > > > > > > > > > document
> > > > > > > > > > > > > > > > [1].
> > > > > > > > > > > > > > > > > > > > Looking
> > > > > > > > > > > > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > > > > > your feedbacks.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > > > > > >

Yangze Guo

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Hi @Till, @Xintong

Even without the credential concerns, I think replacing the interfaces
with configuration options is a good idea.
- Currently, I don't see any external resource that is incompatible
with this mechanism.
- It relieves users of the burden of implementing a plugin themselves.
WDYT?

Best,
Yangze Guo
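
For illustration, the configuration-only approach discussed in this thread could look roughly like the sketch below. It is a minimal sketch, not Flink code: the class `ExternalResourceConfig` and the method `toRequestEntries` are hypothetical, the option names assume that 'external-resource.{resourceName}.yarn/kubernetes.key/amount' expands to `external-resource.<name>.amount`, `external-resource.<name>.kubernetes.key` and `external-resource.<name>.yarn.key`, and the values `nvidia.com/gpu` and `yarn.io/gpu` are example values only. No user code runs on the RM; it only reads strings from the cluster configuration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; class and method names are not part of Flink. */
final class ExternalResourceConfig {

    private static final String PREFIX = "external-resource.";

    private ExternalResourceConfig() {}

    /**
     * Builds the key-value pair that the ResourceManager would attach to a
     * pod / container request for one external resource.
     *
     * @param conf         flat view of the cluster configuration
     * @param resourceName e.g. "gpu"
     * @param deployment   "kubernetes" or "yarn"
     */
    static Map<String, Long> toRequestEntries(
            Map<String, String> conf, String resourceName, String deployment) {
        String amount = conf.get(PREFIX + resourceName + ".amount");
        String key = conf.get(PREFIX + resourceName + "." + deployment + ".key");
        Map<String, Long> entries = new HashMap<>();
        // A resource without an amount or without a deployment-specific key
        // is simply not forwarded to the external resource manager.
        if (amount != null && key != null) {
            entries.put(key, Long.parseLong(amount.trim()));
        }
        return entries;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("external-resource.gpu.amount", "2");
        conf.put("external-resource.gpu.kubernetes.key", "nvidia.com/gpu");
        conf.put("external-resource.gpu.yarn.key", "yarn.io/gpu");
        System.out.println(toRequestEntries(conf, "gpu", "kubernetes"));
        System.out.println(toRequestEntries(conf, "gpu", "yarn"));
    }
}
```

A resource that has no key configured for a given deployment yields an empty map, which would be the configuration-level equivalent of "this resource is not supported on this deployment target".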

On Mon, Mar 30, 2020 at 5:44 PM Xintong Song <[hidden email]> wrote:

>
> I also agree that the pluggable ExternalResourceDriver should be loaded by
> the cluster class loader. Although the plugin might be implemented by users,
> external resources (as part of task executor resources) should be cluster
> configurations, unlike job-level user code such as UDFs, because the task
> executors belong to the cluster rather than to jobs.
>
>
> IIUC, the concern Stephan raised is about the potential credential problem
> when executing user code on the RM with the cluster class loader. The concern
> makes sense to me, and I think what Yangze suggested is a good approach to
> preventing such credential problems. The only reason we tried to execute
> user code (i.e. getKubernetes/YarnExternalResource) on the RM was that we
> need to set these key-value pairs on pod/container requests. By replacing
> the interfaces getKubernetes/YarnExternalResource with the configuration
> options 'external-resource.{resourceName}.yarn/kubernetes.key/amount',
> we can still fulfill that purpose without the credential risks.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Mon, Mar 30, 2020 at 5:17 PM Till Rohrmann <[hidden email]> wrote:
>
> > At the moment the RM does not have a user code class loader and I agree
> > with Stephan that it should stay like this. This, however, does not mean
> > that we cannot support pluggable components in the RM. As long as the
> > plugins are on the system's class path, it should be fine for the RM to
> > load them. For example, we could add external resources via Flink's plugin
> > mechanism or something similar.
> >
> > A very simple implementation of such an ExternalResourceDriver could be a
> > class which simply returns what is written in the flink-conf.yaml under a
> > given key.
> >
> > Cheers,
> > Till
> >
> > On Mon, Mar 30, 2020 at 5:39 AM Yangze Guo <[hidden email]> wrote:
> >
> > > Hi, Stephan,
> > >
> > > I see your concern and I totally agree with you.
> > >
> > > The interface on the RM side is now `Map<String key, String/Long value>
> > > getYarn/KubernetesExternalResource()`. The only valid information the RM
> > > gets from it is the configuration key of that external resource in
> > > Yarn/K8s. The "String/Long value" would be the same as
> > > external-resource.{resourceName}.amount.
> > > So, I think it makes sense to replace these two interfaces with two
> > > configs, i.e. external-resource.{resourceName}.yarn/kubernetes.key. We
> > > may lose some extensibility, but AFAIK it could work with common
> > > external resources like GPU, FPGA. WDYT?
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Fri, Mar 27, 2020 at 7:59 PM Stephan Ewen <[hidden email]> wrote:
> > > >
> > > > Maybe one final comment: It is probably not an issue, but let's try and
> > > > keep user code (via user code classloader) out of the ResourceManager,
> > > > if possible.
> > > >
> > > > As background:
> > > >
> > > > There were thoughts in the past to support setups where the RM must run
> > > > with "superuser" credentials, but we cannot run JM/TM with these
> > > > credentials, as the user code might access them otherwise.
> > > > This is actually possible today, you can run the RM in a different JVM
> > > > or in a different container, and give it more credentials than JMs / TMs.
> > > > But for this to be feasible, we cannot allow any user-defined code to be
> > > > in the JVM, because that instantaneously breaks the isolation of
> > > > credentials.
> > > >
> > > >
> > > >
> > > > On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:
> > > >
> > > > > Thanks for the feedback, @Till and @Xintong.
> > > > >
> > > > > Regarding separating the interface, I'm also +1 with it.
> > > > >
> > > > > Regarding the resource allocation interface, true, it's dangerous to
> > > > > give too much access to user code. Changing the return type to
> > > > > Map<String key, String/Long value> makes sense to me. AFAIK, it is
> > > > > compatible with all the first-party supported resources for
> > > > > Yarn/Kubernetes. It could also free us from the potential dependency
> > > > > issue.
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]> wrote:
> > > > > >
> > > > > > Thanks for updating the FLIP, Yangze.
> > > > > >
> > > > > > I agree with Till that we probably want to separate the K8s/Yarn
> > > > > > decorator calls. Users can still configure one driver class, and we
> > > > > > can use `instanceof` to check whether the driver implements the
> > > > > > K8s/Yarn specific interfaces.
> > > > > >
> > > > > > Moreover, I'm not sure about exposing the entire `ContainerRequest` /
> > > > > > `Pod` (`AbstractKubernetesStepDecorator` directly manipulates `Pod`)
> > > > > > to user code. It gives user code more access than is needed for
> > > > > > defining an external resource, which might cause problems. Instead, I
> > > > > > would suggest having an interface like `Map<String key, String value>
> > > > > > getYarn/KubernetesExternalResource()` and assembling the entries into
> > > > > > `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
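
The interface segregation and `instanceof` check proposed at this point in the thread can be sketched as follows. This is a minimal sketch under stated assumptions, not the FLIP's actual API: `ExternalResourceDriver` is the name used in the thread, but its shape here, the interfaces `KubernetesExternalResource` / `YarnExternalResource`, the driver `K8sOnlyGpuDriver` and the helper `RequestAssembler` are all hypothetical, and `nvidia.com/gpu` is an example value only.

```java
import java.util.Collections;
import java.util.Map;

/** Base interface: free of any deployment-specific calls (shape is assumed). */
interface ExternalResourceDriver {
    String resourceName();
}

/** Implemented only by drivers that support Kubernetes (hypothetical name). */
interface KubernetesExternalResource {
    Map<String, String> getKubernetesExternalResource();
}

/** Implemented only by drivers that support Yarn (hypothetical name). */
interface YarnExternalResource {
    Map<String, String> getYarnExternalResource();
}

/** Example driver that supports Kubernetes but deliberately not Yarn. */
final class K8sOnlyGpuDriver implements ExternalResourceDriver, KubernetesExternalResource {
    @Override
    public String resourceName() {
        return "gpu";
    }

    @Override
    public Map<String, String> getKubernetesExternalResource() {
        // Example value only: resource key and amount for the pod request.
        return Collections.singletonMap("nvidia.com/gpu", "1");
    }
}

/** Stand-in for the deployment-specific ResourceManager logic. */
final class RequestAssembler {
    private RequestAssembler() {}

    static Map<String, String> kubernetesEntries(ExternalResourceDriver driver) {
        // Only drivers that opted in to Kubernetes contribute entries.
        if (driver instanceof KubernetesExternalResource) {
            return ((KubernetesExternalResource) driver).getKubernetesExternalResource();
        }
        return Collections.emptyMap();
    }

    static Map<String, String> yarnEntries(ExternalResourceDriver driver) {
        // Only drivers that opted in to Yarn contribute entries.
        if (driver instanceof YarnExternalResource) {
            return ((YarnExternalResource) driver).getYarnExternalResource();
        }
        return Collections.emptyMap();
    }
}
```

Because the driver only hands back plain key-value pairs, it never sees a `ContainerRequest` or `Pod`, and the base interface does not pull Hadoop onto the classpath.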
> > > > > > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'm a bit late to the party. I think the current proposal looks good.
> > > > > > >
> > > > > > > Concerning the ExternalResourceDriver interface defined in the FLIP
> > > > > > > [1], I would suggest to not include the decorator calls for
> > > > > > > Kubernetes and Yarn in the base interface. Instead I would suggest to
> > > > > > > segregate the deployment specific decorator calls into separate
> > > > > > > interfaces. That way an ExternalResourceDriver does not have to
> > > > > > > support all deployments from the very beginning. Moreover, some
> > > > > > > resources might not be supported by a specific deployment target and
> > > > > > > the natural way to express this would be to not implement the
> > > > > > > respective deployment specific interface.
> > > > > > >
> > > > > > > Moreover, having void
> > > > > > > addExternalResourceToRequest(AMRMClient.ContainerRequest
> > > > > > > containerRequest) in the ExternalResourceDriver interface would
> > > > > > > require Hadoop on Flink's classpath whenever the external resource
> > > > > > > driver is being used.
> > > > > > >
> > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > >
> > > > > > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > >
> > > > > > > > Nice, thanks a lot!
> > > > > > > >
> > > > > > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > > > > > > >
> > > > > > > > > I've updated the FLIP accordingly. I do not add a
> > > > > > > > > ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> > > > > > > > > which takes the responsibility of all relevant operations on both RM
> > > > > > > > > and TM sides.
> > > > > > > > > After a rethink about decoupling the management of external resources
> > > > > > > > > from TaskExecutor, I think we could do the same thing on the
> > > > > > > > > ResourceManager side. We do not need to add a specific allocation
> > > > > > > > > logic to the ResourceManager each time we add a specific external
> > > > > > > > > resource.
> > > > > > > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > > > > > > containerRequest.
> > > > > > > > > - For Kubernetes, ExternalResourceDriver could provide a decorator for
> > > > > > > > > the TM pod.
> > > > > > > > >
> > > > > > > > > In this way, just like MetricReporter, we allow users to define their
> > > > > > > > > custom ExternalResourceDriver. It is more extensible and fits the
> > > > > > > > > separation of concerns. For more details, please take a look at [1].
> > > > > > > > >
> > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > This sounds good to go ahead from my side.
> > > > > > > > > >
> > > > > > > > > > I like the approach that Becket suggested - in that case the core
> > > > > > > > > > abstraction that everyone would need to understand would be "external
> > > > > > > > > > resource allocation" and the "ResourceInfoProvider", and the GPU
> > > > > > > > > > specific code would be a specific implementation only known to that
> > > > > > > > > > component that allocates the external resource. That fits the
> > > > > > > > > > separation of concerns well.
> > > > > > > > > >
> > > > > > > > > > I also understand that it should not be over-engineered in the first
> > > > > > > > > > version, so some simplification makes sense, and then gradually expand
> > > > > > > > > > from there.
> > > > > > > > > >
> > > > > > > > > > So +1 to go ahead with what was suggested above (Xintong / Becket)
> > > > > > > > > > from my side.
> > > > > > > > > >
> > > > > > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > > > > > >
> > > > > > > > > > > @Stephan
> > > > > > > > > > >
> > > > > > > > > > > I see your concern, and I completely agree with you that we should
> > > > > > > > > > > first think about the "library" / "plugin" / "extension" style if
> > > > > > > > > > > possible.
> > > > > > > > > > >
> > > > > > > > > > > > If GPUs are sliced and assigned during scheduling, there may be
> > > > > > > > > > > > reason, although it looks that it would belong to the slot then.
> > > > > > > > > > > > Is that what we are doing here?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > In the current proposal, we do not have the GPUs sliced and assigned
> > > > > > > > > > > to slots, because it could be problematic without dynamic slot
> > > > > > > > > > > allocation. E.g., the number of GPUs might not be evenly divisible by
> > > > > > > > > > > the number of slots.
> > > > > > > > > > >
> > > > > > > > > > > I think it makes sense to eventually have the GPUs assigned to slots.
> > > > > > > > > > > Even then, we might still need a TM level GPUManager (or
> > > > > > > > > > > ResourceProvider like Becket suggested). For memory, in each slot we
> > > > > > > > > > > can simply request the amount of memory, leaving it to JVM / OS to
> > > > > > > > > > > decide which memory (address) should be assigned. For GPU, and
> > > > > > > > > > > potentially other resources like FPGA, we need to explicitly specify
> > > > > > > > > > > which GPU (index) should be used. Therefore, we need some component at
> > > > > > > > > > > the TM level to coordinate which slot uses which GPU.
> > > > > > > > > > >
> > > > > > > > > > > IMO, unless we say Flink will not support slot-level GPU slicing at
> > > > > > > > > > > least in the foreseeable future, I don't see a good way to avoid
> > > > > > > > > > > touching the TM core. To that end, I think Becket's suggestion points
> > > > > > > > > > > to a good direction, that supports more features (GPU, FPGA, etc.)
> > > > > > > > > > > with less coupling to the TM core (only needs to understand the
> > > > > > > > > > > general interfaces). The detailed implementation for specific resource
> > > > > > > > > > > types can even be encapsulated as a library.
> > > > > > > > > > >
> > > > > > > > > > > @Becket
> > > > > > > > > > >
> > > > > > > > > > > Thanks for sharing your thoughts on the final state. Leaving aside the
> > > > > > > > > > > details of how the interfaces should look, I think this is a really
> > > > > > > > > > > good abstraction for supporting general resource types.
> > > > > > > > > > >
> > > > > > > > > > > I'd like to further clarify that the following three things are all
> > > > > > > > > > > that the "Flink core" needs to understand.
> > > > > > > > > > >
> > > > > > > > > > > - The *amount* of resource, for scheduling. Actually, we already have
> > > > > > > > > > > the Resource class in ResourceProfile and ResourceSpec for extended
> > > > > > > > > > > resources. It's just not really used.
> > > > > > > > > > > - The *info* that Flink provides to the operators / user code.
> > > > > > > > > > > - The *provider*, which generates the info based on the amount.
> > > > > > > > > > >
> > > > > > > > > > > The "core" does not need to understand the specific implementation
> > > > > > > > > > > details of the above three. They can even be implemented in a
> > > > > > > > > > > 3rd-party library, similar to how we allow users to define their
> > > > > > > > > > > custom MetricReporter.
> > > > > > > > > > >
> > > > > > > > > > > Thank you~
> > > > > > > > > > >
> > > > > > > > > > > Xintong Song
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for the comment, Stephan.
> > > > > > > > > > > >
> > > > > > > > > > > > > - If everything becomes a "core feature", it will make the project
> > > > > > > > > > > > > hard to develop in the future. Thinking "library" / "plugin" /
> > > > > > > > > > > > > "extension" style where possible helps.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Completely agree. It is much more important to design a mechanism than
> > > > > > > > > > > > to focus on a specific case. Here is what I am thinking to fully
> > > > > > > > > > > > support custom resource management:
> > > > > > > > > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to define
> > > > > > > > > > > > the resource and the amount required. They will be used to find
> > > > > > > > > > > > suitable TM slots to run the tasks. At this point, the resources are
> > > > > > > > > > > > only measured by amount, i.e. they do not have individual IDs.
> > > > > > > > > > > >
> > > > > > > > > > > > 2. On the TM side, have something like *"ResourceInfoProvider"* to
> > > > > > > > > > > > identify and provide the detailed information of the individual
> > > > > > > > > > > > resource, e.g. GPU ID. It is important because the operator may have
> > > > > > > > > > > > to explicitly interact with the physical resource it uses. The
> > > > > > > > > > > > ResourceInfoProvider might look like something below.
> > > > > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(
> > > > > > > > > > > >         OperatorId opId, ResourceProfile resourceProfile);
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > - There could be several "*ResourceInfoProvider*" configured on the TM
> > > > > > > > > > > > to retrieve the information for different resources.
> > > > > > > > > > > > - The TM will be responsible for assigning those individual resources
> > > > > > > > > > > > to each operator according to their requested amount.
> > > > > > > > > > > > - The operators will be able to get the ResourceInfo from their
> > > > > > > > > > > > RuntimeContext.
> > > > > > > > > > > >
> > > > > > > > > > > > If we agree this is a reasonable final state, we can adapt the current
> > > > > > > > > > > > FLIP to it. In fact it does not sound like a big change to me. All the
> > > > > > > > > > > > proposed configuration can stay as is; it is just that Flink itself
> > > > > > > > > > > > won't care about it, and instead a GPUInfoProvider implementing the
> > > > > > > > > > > > ResourceInfoProvider will use it.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
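
A simplified sketch of the provider idea above, showing how a TM-level component could hand out individual GPU indexes according to a requested amount. This is only an illustration under assumptions: the Flink types `AbstractID`, `OperatorId` and `ResourceProfile` from the proposed signature are replaced here by a plain `String` operator id and an `int` amount so that the example is self-contained, and `GpuInfo` and `GpuInfoProvider` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified stand-in for the proposed interface (signature is assumed). */
interface ResourceInfoProvider<INFO> {
    Map<String, INFO> retrieveResourceInfo(String operatorId, int amount);
}

/** Hypothetical info object: just the device index an operator should use. */
final class GpuInfo {
    final int index;

    GpuInfo(int index) {
        this.index = index;
    }
}

/** Hands out each GPU index at most once, so operators never share a device. */
final class GpuInfoProvider implements ResourceInfoProvider<GpuInfo> {
    private final Deque<Integer> freeIndexes = new ArrayDeque<>();

    GpuInfoProvider(int gpuCount) {
        for (int i = 0; i < gpuCount; i++) {
            freeIndexes.add(i);
        }
    }

    @Override
    public synchronized Map<String, GpuInfo> retrieveResourceInfo(String operatorId, int amount) {
        if (amount > freeIndexes.size()) {
            throw new IllegalStateException(
                    "Operator " + operatorId + " requested " + amount
                            + " GPUs, but only " + freeIndexes.size() + " are free");
        }
        Map<String, GpuInfo> assigned = new LinkedHashMap<>();
        for (int i = 0; i < amount; i++) {
            int index = freeIndexes.poll();
            assigned.put("gpu-" + index, new GpuInfo(index));
        }
        return assigned;
    }
}
```

The scheduler side still reasons only about amounts; the concrete indexes are decided on the TM, which is the coordination point the thread argues for.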
> > > > > > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi all!
> > > > > > > > > > > > >
> > > > > > > > > > > > > The main point I wanted to throw into the discussion is the following:
> > > > > > > > > > > > > - With more and more use cases, more and more tools go into Flink
> > > > > > > > > > > > > - If everything becomes a "core feature", it will make the project
> > > > > > > > > > > > > hard to develop in the future. Thinking "library" / "plugin" /
> > > > > > > > > > > > > "extension" style where possible helps.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - A good thought experiment is always: How many future developers
> > > > > > > > > > > > > have to interact with this code (and possibly understand it
> > > > > > > > > > > > > partially), even if the features they touch have nothing to do with
> > > > > > > > > > > > > GPU support. If many contributors to unrelated features will have to
> > > > > > > > > > > > > touch it and understand it, then let's think if there is a different
> > > > > > > > > > > > > solution. Maybe there is not, but then we should be sure why.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - That led me to raising this issue: If the GPU manager becomes a
> > > > > > > > > > > > > core service in the TaskManager, Environment, RuntimeContext, etc.
> > > > > > > > > > > > > then everyone developing TM and streaming tasks need to understand
> > > > > > > > > > > > > the GPU manager. That seems oddly specific, is my impression.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Access to configuration seems not the right reason to do that. We
> > > > > > > > > > > > > should expose the Flink configuration from the RuntimeContext anyways.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If GPUs are sliced and assigned during scheduling, there may be
> > > > > > > > > > > > > reason, although it looks that it would belong to the slot then. Is
> > > > > > > > > > > > > that what we are doing here?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Stephan
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > IMO, eventually an operator should only see info of GPUs that are
> > > > > > > > > > > > > > dedicated to it, instead of all GPUs on the machine/container as in
> > > > > > > > > > > > > > the current design. It does not make sense to let the user who
> > > > > > > > > > > > > > writes a UDF worry about coordination among multiple operators
> > > > > > > > > > > > > > running on the same machine. And if we want to limit the GPU info an
> > > > > > > > > > > > > > operator sees, we should not let the operator instantiate
> > > > > > > > > > > > > > GPUManager, which means we have to expose something through the
> > > > > > > > > > > > > > runtime context, either GPU info or some kind of limited access to
> > > > > > > > > > > > > > the GPUManager.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <
> > > > > > > > [hidden email]
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > It probably makes sense for us to first agree on
> > the
> > > > > final
> > > > > > > > > state.
> > > > > > > > > > > More
> > > > > > > > > > > > > > > specifically, will the resource info be exposed
> > > through
> > > > > > > > runtime
> > > > > > > > > > > > context
> > > > > > > > > > > > > > > eventually?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If that is the final state and we have a seamless
> > > > > migration
> > > > > > > > > story
> > > > > > > > > > > > from
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > FLIP to that final state, personally I think it
> > is
> > > OK
> > > > > to
> > > > > > > > > expose the
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > info in the runtime context.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > > > > > I think what Stephan means (@Stephan, please
> > > correct
> > > > > me
> > > > > > > if
> > > > > > > > > I'm
> > > > > > > > > > > > wrong)
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > that, we might not need to hold and maintain
> > the
> > > > > > > GPUManager
> > > > > > > > > as a
> > > > > > > > > > > > > > service
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > TaskManagerServices or RuntimeContext. An
> > > > > alternative is
> > > > > > > to
> > > > > > > > > > > create
> > > > > > > > > > > > /
> > > > > > > > > > > > > > > > retrieve the GPUManager only in the operators
> > > that
> > > > > need
> > > > > > > it,
> > > > > > > > > e.g.,
> > > > > > > > > > > > > with
> > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > > > > > I agree with you on excluding GPUManager from
> > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - For the first step, where we provide
> > unified
> > > > > > > TM-level
> > > > > > > > > GPU
> > > > > > > > > > > > > > > information
> > > > > > > > > > > > > > > > to all operators, it should be fine to have
> > > > > operators
> > > > > > > > > access /
> > > > > > > > > > > > > > > > > lazily initialize GPUManager by themselves.
> > > > > > > > > > > > > > > > - In future, we might have some more
> > > fine-grained
> > > > > GPU
> > > > > > > > > > > > management,
> > > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > we need to maintain GPUManager as a service
> > > and
> > > > > put
> > > > > > > GPU
> > > > > > > > > info
> > > > > > > > > > > in
> > > > > > > > > > > > > slot
> > > > > > > > > > > > > > > > profiles. But at least for now it's not
> > > necessary
> > > > > to
> > > > > > > > > introduce
> > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > complexity.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > However, I have some concerns on excluding
> > > GPUManager
> > > > > > > from
> > > > > > > > > > > > > > RuntimeContext
> > > > > > > > > > > > > > > > > and letting operators access it directly.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - Configurations needed for creating the
> > > > > GPUManager are
> > > > > > > > not
> > > > > > > > > > > > always
> > > > > > > > > > > > > > > > available for operators.
> > > > > > > > > > > > > > > > - If later we want to have fine-grained
> > > control
> > > > > over
> > > > > > > GPU
> > > > > > > > > > > (e.g.,
> > > > > > > > > > > > > > > > operators in each slot can only see GPUs
> > > reserved
> > > > > for
> > > > > > > > that
> > > > > > > > > > > > slot),
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I would suggest wrapping the GPUManager behind
> > > > > > > > RuntimeContext
> > > > > > > > > and
> > > > > > > > > > > > only
> > > > > > > > > > > > > > > > expose the GPUInfo to users. For now, we can
> > > declare
> > > > > a
> > > > > > > > method
> > > > > > > > > > > > > > > > `getGPUInfo()` in RuntimeContext, with a
> > default
> > > > > > > definition
> > > > > > > > > that
> > > > > > > > > > > > > calls
> > > > > > > > > > > > > > > > `GPUManager.get()` to get the lazily-created
> > > > > GPUManager.
> > > > > > > If
> > > > > > > > > later
> > > > > > > > > > > > we
> > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > > to create / retrieve GPUManager in a different
> > > way,
> > > > > we
> > > > > > > can
> > > > > > > > > simply
> > > > > > > > > > > > > > change
> > > > > > > > > > > > > > > > how `getGPUInfo` is implemented, without
> > needing
> > > to
> > > > > > > change
> > > > > > > > > any
> > > > > > > > > > > > public
> > > > > > > > > > > > > > > > interfaces.
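The wrapping suggested above can be sketched roughly as follows. This is only an illustration under stated assumptions: `GPUInfo`, `GPUManager`, and this `RuntimeContext` are hypothetical stand-ins, not Flink's actual classes, and the two "discovered" GPUs are hard-coded in place of the discovery script.

```java
import java.util.List;

/** Hypothetical read-only description of one GPU; only its index is modeled. */
final class GPUInfo {
    private final int index;

    GPUInfo(int index) {
        this.index = index;
    }

    int getIndex() {
        return index;
    }
}

/** Hypothetical TM-level manager, created lazily on first access. */
final class GPUManager {
    private static GPUManager instance;
    private final List<GPUInfo> gpus;

    private GPUManager(List<GPUInfo> gpus) {
        this.gpus = gpus;
    }

    /** Returns the single lazily-created manager (assumption: one per JVM). */
    static synchronized GPUManager get() {
        if (instance == null) {
            // Stand-in for running the discovery script: pretend two GPUs were found.
            instance = new GPUManager(List.of(new GPUInfo(0), new GPUInfo(1)));
        }
        return instance;
    }

    List<GPUInfo> getGPUInfo() {
        return gpus; // List.of already returns an unmodifiable list
    }
}

/** Hypothetical slice of a runtime context: users only ever see GPUInfo. */
interface RuntimeContext {
    // Default definition delegates to the lazily-created manager. A later
    // version could override this (e.g. per-slot GPUs) without changing
    // the public method signature.
    default List<GPUInfo> getGPUInfo() {
        return GPUManager.get().getGPUInfo();
    }
}
```

With this shape, a UDF would call `getGPUInfo()` on its context and never touch `GPUManager` directly, so the way the manager is created or scoped stays an internal detail.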
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <
> > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > > > > Do you mean Minicluster? Yes, it makes sense
> > to
> > > > > share
> > > > > > > the
> > > > > > > > > GPU
> > > > > > > > > > > > > Manager
> > > > > > > > > > > > > > > > > > in such a scenario.
> > > > > > > > > > > > > > > > > If that's what you worry about, I'm +1 for
> > > holding
> > > > > > > > > > > > > > > > > GPUManager(ExternalResourceManagers) in
> > > > > TaskExecutor
> > > > > > > > > instead of
> > > > > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Regarding the RuntimeContext/FunctionContext,
> > > it
> > > > > just
> > > > > > > > > holds the
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > info instead of the GPU Manager. AFAIK, it's
> > > the
> > > > > only
> > > > > > > > > place we
> > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > pass GPU info to the
> > > > > RichFunction/UserDefinedFunction.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac
> > Godfried
> > > <
> > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000
> > > > > > > > [hidden email]
> > > > > > > > > > > wrote
> > > > > > > > > > > > > > ----
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Can we somehow keep this out of the
> > > > > TaskManager
> > > > > > > > > services
> > > > > > > > > > > > > > > > > > > I fear that we could not. IMO, the
> > > > > GPUManager(or
> > > > > > > > > > > > > > > > > > > ExternalServicesManagers in future) is
> > > > > conceptually
> > > > > > > > > one of
> > > > > > > > > > > > the
> > > > > > > > > > > > > > task
> > > > > > > > > > > > > > > > > > > manager services, just like MemoryManager
> > > > > before
> > > > > > > > 1.10.
> > > > > > > > > > > > > > > > > > > - It maintains/holds the GPU resource at
> > TM
> > > > > level
> > > > > > > and
> > > > > > > > > all
> > > > > > > > > > > of
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > operators allocate the GPU resources from
> > > it.
> > > > > So,
> > > > > > > it
> > > > > > > > > should
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > > > > > > > > > > - We could add a collection called
> > > > > > > > > ExternalResourceManagers
> > > > > > > > > > > > to
> > > > > > > > > > > > > > hold
> > > > > > > > > > > > > > > > > > > all managers of other external resources
> > > in the
> > > > > > > > future.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Can you help me understand why this needs
> > the
> > > > > > > addition
> > > > > > > > in
> > > > > > > > > > > > > > > > > > TaskManagerServices
> > > > > > > > > > > > > > > > > > or in the RuntimeContext?
> > > > > > > > > > > > > > > > > > Are you worried about the case when
> > multiple
> > > Task
> > > > > > > > > Executors
> > > > > > > > > > > run
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > > > > JVM? That's not common, but wouldn't it
> > > actually
> > > > > be
> > > > > > > > good
> > > > > > > > > in
> > > > > > > > > > > > that
> > > > > > > > > > > > > > case
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > share the GPU Manager, given that the GPU
> > is
> > > > > shared?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > > > > > > > In this FLIP, operators need the
> > > information.
> > > > > Thus,
> > > > > > > > we
> > > > > > > > > > > expose
> > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > information to the
> > > > > RuntimeContext/FunctionContext.
> > > > > > > > The
> > > > > > > > > slot
> > > > > > > > > > > > > > profile
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > not aware of GPU resources as GPU is TM
> > > level
> > > > > > > > resource
> > > > > > > > > now.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Can the GPU Manager be a "self
> > contained"
> > > > > thing
> > > > > > > > that
> > > > > > > > > > > simply
> > > > > > > > > > > > > > takes
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > configuration, and then abstracts
> > > everything
> > > > > > > > > internally?
> > > > > > > > > > > > > > > > > > > Yes, we just pass the path/args of the
> > > discovery
> > > > > > > > script
> > > > > > > > > and
> > > > > > > > > > > > how
> > > > > > > > > > > > > > many
> > > > > > > > > > > > > > > > > > > GPUs per TM to it. It takes the
> > > responsibility
> > > > > to
> > > > > > > get
> > > > > > > > > the
> > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > information and expose them to the
> > > > > > > > > > > > > RuntimeContext/FunctionContext
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > Operators. Meanwhile, we'd better not
> > allow
> > > > > > > operators
> > > > > > > > > to
> > > > > > > > > > > > > directly
> > > > > > > > > > > > > > > > > > > > access GPUManager; they should get what
> > they
> > > want
> > > > > > > from
> > > > > > > > > > > Context.
> > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > then decouple the
> > interface/implementation
> > > of
> > > > > > > > > GPUManager
> > > > > > > > > > > and
> > > > > > > > > > > > > > Public
> > > > > > > > > > > > > > > > > > > API.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan
> > > Ewen <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > It sounds fine to initially start with
> > > GPU
> > > > > > > specific
> > > > > > > > > > > support
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > > > generalizing this once we better
> > > understand
> > > > > the
> > > > > > > > > space.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > About the implementation suggested in
> > > > > FLIP-108:
> > > > > > > > > > > > > > > > > > > > - Can we somehow keep this out of the
> > > > > TaskManager
> > > > > > > > > > > services?
> > > > > > > > > > > > > > > > Anything
> > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > have to pull through all layers of the
> > TM
> > > > > makes
> > > > > > > the
> > > > > > > > > TM
> > > > > > > > > > > > > > components
> > > > > > > > > > > > > > > > yet
> > > > > > > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > > > complex and harder to maintain.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > - What parts need information about
> > this?
> > > > > > > > > > > > > > > > > > > > -> do the slot profiles need
> > information
> > > > > about
> > > > > > > the
> > > > > > > > > GPU?
> > > > > > > > > > > > > > > > > > > > -> Can the GPU Manager be a "self
> > > contained"
> > > > > > > thing
> > > > > > > > > that
> > > > > > > > > > > > > simply
> > > > > > > > > > > > > > > > takes
> > > > > > > > > > > > > > > > > > > > the configuration, and then abstracts
> > > > > everything
> > > > > > > > > > > > internally?
> > > > > > > > > > > > > > > > > Operators
> > > > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > access it via "GPUManager.get()" or so?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze
> > > Guo <
> > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo,
> > you're
> > > > > right,
> > > > > > > > > I'll add
> > > > > > > > > > > > > them
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > > > > Regarding the general extended
> > resource
> > > > > > > > mechanism,
> > > > > > > > > I
> > > > > > > > > > > > second
> > > > > > > > > > > > > > > > > Xintong's
> > > > > > > > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > > > > > > > - It's better to leverage
> > > ResourceProfile
> > > > > and
> > > > > > > > > > > > ResourceSpec
> > > > > > > > > > > > > > > after
> > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > > > support fine-grained GPU
> > > scheduling. As
> > > > > a
> > > > > > > > first
> > > > > > > > > step
> > > > > > > > > > > > > > > > proposal, I
> > > > > > > > > > > > > > > > > > > > > prefer to not include it in the scope
> > > of
> > > > > this
> > > > > > > > FLIP.
> > > > > > > > > > > > > > > > > > > > > - Regarding the "Extended Resource
> > > > > Manager",
> > > > > > > if I
> > > > > > > > > > > > > understand
> > > > > > > > > > > > > > > > > > > > > > correctly, it is just a code refactoring
> > > atm,
> > > > > we
> > > > > > > > could
> > > > > > > > > > > > extract
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > open/close/allocateExtendResources of
> > > > > > > GPUManager
> > > > > > > > to
> > > > > > > > > > > that
> > > > > > > > > > > > > > > > > interface. If
> > > > > > > > > > > > > > > > > > > > > that is the case, +1 to do it during
> > > > > > > > > implementation.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > > > > As Xintong said, we looked into how
> > > Spark
> > > > > > > > supports
> > > > > > > > > a
> > > > > > > > > > > > > general
> > > > > > > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > > > > > > > Resource Scheduling" before and
> > > decided to
> > > > > > > > > introduce a
> > > > > > > > > > > > > common
> > > > > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > > > > configuration
> > > > > > > > > > > > > > > > > > > > > > > schema (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > > > > > > > to make it more extensible. I think
> > the
> > > > > > > > "resource"
> > > > > > > > > is a
> > > > > > > > > > > > > > proper
> > > > > > > > > > > > > > > > > level
> > > > > > > > > > > > > > > > > > > > > to contain all the configs of
> > extended
> > > > > > > resources.
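As a rough illustration of the common configuration schema mentioned above, the per-resource keys could be derived from the resource name as sketched below. The class and method names are hypothetical, the exact key suffixes simply follow the pattern quoted in the message, and a plain `Map` stands in for Flink's configuration object.

```java
import java.util.Map;

/** Hypothetical helper that builds config keys for one external resource type. */
final class ExternalResourceKeys {
    // Common prefix from the schema: taskmanager.resource.{resourceName}.<suffix>
    private static final String PREFIX = "taskmanager.resource.";

    private ExternalResourceKeys() {}

    static String amountKey(String resourceName) {
        return PREFIX + resourceName + ".amount";
    }

    static String discoveryScriptKey(String resourceName) {
        return PREFIX + resourceName + ".discovery-script";
    }

    /** Reads the configured amount, treating a missing key as 0 (an assumption). */
    static long amount(Map<String, String> conf, String resourceName) {
        return Long.parseLong(conf.getOrDefault(amountKey(resourceName), "0"));
    }
}
```

Under this scheme, supporting another resource type such as FPGA would only mean using a different `resourceName`; no new key layout is needed.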
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM
> > Xingbo
> > > > > Huang <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource
> > > > > > > management
> > > > > > > > > > > support
> > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > greatly
> > > > > > > > > > > > > > > > > > > > > > facilitate the development of
> > > AI-related
> > > > > > > > > applications
> > > > > > > > > > > > by
> > > > > > > > > > > > > > > > PyFlink
> > > > > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I have only one comment about this
> > > wiki:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Regarding the names of several GPU
> > > > > > > > > configurations, I
> > > > > > > > > > > > > think
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > better
> > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > delete the resource field to make it
> > > > > consistent
> > > > > > > > > with
> > > > > > > > > > > the
> > > > > > > > > > > > > > names
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > > > > > resource-related configurations in
> > > > > > > > > TaskManagerOption.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > e.g.
> > > > > > > > > taskmanager.resource.gpu.discovery-script.path
> > > > > > > > > > > ->
> > > > > > > > > > > > > > > > > > > > > >
> > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Xintong Song <
> > [hidden email]>
> > > > > > > > > > wrote on Wed, Mar 4, 2020
> > > > > > > > > > > > > > > at 10:39 AM:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also
> > > had
> > > > > an
> > > > > > > > > offline
> > > > > > > > > > > > > > discussion
> > > > > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > > > > making
> > > > > > > > > > > > > > > > > > > > > > > the "GPU Support" as some general
> > > > > "Extended
> > > > > > > > > > > Resource
> > > > > > > > > > > > > > > > Support".
> > > > > > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > > > > > believe
> > > > > > > > > > > > > > > > > > > > > > > supporting extended resources in
> > a
> > > > > general
> > > > > > > > > > > mechanism
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > definitely
> > > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > > good
> > > > > > > > > > > > > > > > > > > > > > > and extensible way. The reason we
> > > > > propose
> > > > > > > > this
> > > > > > > > > FLIP
> > > > > > > > > > > > > > > narrowing
> > > > > > > > > > > > > > > > > its
> > > > > > > > > > > > > > > > > > > scope
> > > > > > > > > > > > > > > > > > > > > > > down to GPU alone, is mainly for
> > > the
> > > > > > > concern
> > > > > > > > on
> > > > > > > > > > > extra
> > > > > > > > > > > > > > > efforts
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > review
> > > > > > > > > > > > > > > > > > > > > > > capacity needed for a general
> > > > > mechanism.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > To come up with a good design for
> > a
> > > > > general
> > > > > > > > > extended
> > > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > > management
> > > > > > > > > > > > > > > > > > > > > > > mechanism, we would need to
> > > investigate
> > > > > > > more
> > > > > > > > > on how
> > > > > > > > > > > > > > people
> > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > > different
> > > > > > > > > > > > > > > > > > > > > > > > kinds of resources in practice.
> > For
> > > > > GPU, we
> > > > > > > > > learnt
> > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > knowledge
> > > > > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > > experts, Becket and his team
> > > members.
> > > > > But
> > > > > > > for
> > > > > > > > > FPGA,
> > > > > > > > > > > > or
> > > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > > potential
> > > > > > > > > > > > > > > > > > > > > > > extended resources, we don't have
> > > such
> > > > > > > > > convenient
> > > > > > > > > > > > > > > information
> > > > > > > > > > > > > > > > > > > sources,
> > > > > > > > > > > > > > > > > > > > > > > > making the investigation require
> > > more
> > > > > > > > efforts,
> > > > > > > > > > > > which I
> > > > > > > > > > > > > > > tend
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > > > > not necessary atm.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On the other hand, we also looked
> > > into
> > > > > how
> > > > > > > > > Spark
> > > > > > > > > > > > > > supports a
> > > > > > > > > > > > > > > > > general
> > > > > > > > > > > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > > > > > > > > > Resource Scheduling". Assuming we
> > > want
> > > > > to
> > > > > > > > have
> > > > > > > > > a
> > > > > > > > > > > > > similar
> > > > > > > > > > > > > > > > > general
> > > > > > > > > > > > > > > > > > > > > extended
> > > > > > > > > > > > > > > > > > > > > > > resource mechanism in the future,
> > > we
> > > > > > > believe
> > > > > > > > > that
> > > > > > > > > > > the
> > > > > > > > > > > > > > > current
> > > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > > > > > > design can be easily extended, in
> > > an
> > > > > > > > > incremental
> > > > > > > > > > > way
> > > > > > > > > > > > > > > without
> > > > > > > > > > > > > > > > > too
> > > > > > > > > > > > > > > > > > > many
> > > > > > > > > > > > > > > > > > > > > > > reworks.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - The most important part is
> > > probably
> > > > > user
> > > > > > > > > > > > interfaces.
> > > > > > > > > > > > > > > Spark
> > > > > > > > > > > > > > > > > > > offers
> > > > > > > > > > > > > > > > > > > > > > > configuration options to define
> > the
> > > > > amount,
> > > > > > > > > > > discovery
> > > > > > > > > > > > > > > script
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > vendor
> > > > > > > > > > > > > > > > > > > > > > > (on
> > > > > > > > > > > > > > > > > > > > > > > > k8s) in a per resource type basis
> > > [1],
> > > > > which
> > > > > > > > is
> > > > > > > > > very
> > > > > > > > > > > > > > similar
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > what
> > > > > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > > > > proposed in this FLIP. I think
> > > it's not
> > > > > > > > > necessary
> > > > > > > > > > > to
> > > > > > > > > > > > > > expose
> > > > > > > > > > > > > > > > > > > config
> > > > > > > > > > > > > > > > > > > > > > > options
> > > > > > > > > > > > > > > > > > > > > > > in the general way atm, since we
> > > do not
> > > > > > > have
> > > > > > > > > > > > support
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > > > > > > > types now. If later we decide to
> > > have
> > > > > per
> > > > > > > > > resource
> > > > > > > > > > > > > type
> > > > > > > > > > > > > > > > config
> > > > > > > > > > > > > > > > > > > > > > > options, we
> > > > > > > > > > > > > > > > > > > > > > > can have backwards compatibility
> > > on the
> > > > > > > > current
> > > > > > > > > > > > > proposed
> > > > > > > > > > > > > > > > > options
> > > > > > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > > > > > simple key mapping.
> > > > > > > > > > > > > > > > > > > > > > > - For the GPU Manager, if later
> > > needed
> > > > > we
> > > > > > > can
> > > > > > > > > > > change
> > > > > > > > > > > > it
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > > "Extended
> > > > > > > > > > > > > > > > > > > > > > > Resource Manager" (or whatever it
> > > is
> > > > > > > called).
> > > > > > > > > That
> > > > > > > > > > > > > should
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > pure
> > > > > > > > > > > > > > > > > > > > > > > component-internal refactoring.
> > > > > > > > > > > > > > > > > > > > > > > - For ResourceProfile and
> > > ResourceSpec,
> > > > > > > there
> > > > > > > > > are
> > > > > > > > > > > > > already
> > > > > > > > > > > > > > > > > > > fields for
> > > > > > > > > > > > > > > > > > > > > > > general extended resource. We can
> > > of
> > > > > course
> > > > > > > > > > > leverage
> > > > > > > > > > > > > them
> > > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > > > > > > supporting
> > > > > > > > > > > > > > > > > > > > > > > fine grained GPU scheduling. That
> > > is
> > > > > also
> > > > > > > not
> > > > > > > > > in
> > > > > > > > > > > the
> > > > > > > > > > > > > > scope
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > > first
> > > > > > > > > > > > > > > > > > > > > > > step proposal, and would require
> > > > > FLIP-56 to
> > > > > > > > be
> > > > > > > > > > > > finished
> > > > > > > > > > > > > > > > first.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > To sum up, I agree with
> > Becket
> > > that we should
> > > > > > > have
> > > > > > > > a
> > > > > > > > > > > > separate
> > > > > > > > > > > > > > > FLIP
> > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > > general extended resource
> > > mechanism,
> > > > > and
> > > > > > > keep
> > > > > > > > > it in
> > > > > > > > > > > > > mind
> > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > > > > discussing
> > > > > > > > > > > > > > > > > > > > > > > and implementing the current one.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to generalize the
> > > > > > > > > > > > > > > > > > > > > > > > > resource management to support custom resources. Having that allows users
> > > > > > > > > > > > > > > > > > > > > > > > > to add new resources by themselves. The general resource management may
> > > > > > > > > > > > > > > > > > > > > > > > > involve two different aspects:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported by the extended
> > > > > > > > > > > > > > > > > > > > > > > > > resources in ResourceProfile and ResourceSpec. This will likely cover
> > > > > > > > > > > > > > > > > > > > > > > > > the majority of the cases.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to assign the resources
> > > > > > > > > > > > > > > > > > > > > > > > > to different tasks, operators, and so on. This may require two levels /
> > > > > > > > > > > > > > > > > > > > > > > > > steps:
> > > > > > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into suitable slots.
> > > > > > > > > > > > > > > > > > > > > > > > > It is done by the global RM and is not customizable right now.
> > > > > > > > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the operators in the TM,
> > > > > > > > > > > > > > > > > > > > > > > > > e.g. GPU 1 for operator A, GPU 2 for operator B. This step is needed
> > > > > > > > > > > > > > > > > > > > > > > > > assuming the global RM does not distinguish individual resources of the
> > > > > > > > > > > > > > > > > > > > > > > > > same type. It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should discover the
> > > > > > > > > > > > > > > > > > > > > > > > > physical GPU information and bind/match them to each operator. Making this
> > > > > > > > > > > > > > > > > > > > > > > > > general will fill in the missing piece to support custom resource type
> > > > > > > > > > > > > > > > > > > > > > > > > definition. But I'd avoid calling it an "External Resource Manager" to avoid
> > > > > > > > > > > > > > > > > > > > > > > > > confusion with RM; maybe something like "Operator Resource Assigner" would
> > > > > > > > > > > > > > > > > > > > > > > > > be more accurate. So for each resource type users can have an optional
> > > > > > > > > > > > > > > > > > > > > > > > > "Operator Resource Assigner" in the TM. For memory, users don't need this,
> > > > > > > > > > > > > > > > > > > > > > > > > but for other extended resources, users may need that.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource Assigner" is achievable
> > > > > > > > > > > > > > > > > > > > > > > > > in this FLIP. But I am also OK with having that in a separate FLIP because
> > > > > > > > > > > > > > > > > > > > > > > > > the interface between the "Operator Resource Assigner" and operator may
> > > > > > > > > > > > > > > > > > > > > > > > > take a while to settle down if we want to make it generic. But I think our
> > > > > > > > > > > > > > > > > > > > > > > > > implementation should take this future work into consideration so that we
> > > > > > > > > > > > > > > > > > > > > > > > > don't need to break backwards compatibility once we have that.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of GPU-aware
> > > > > > > > > > > > > > > > > > > > > > > > > > scheduling and GPU allocation, as I have no experience with that.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is if it makes sense to look
> > > > > > > > > > > > > > > > > > > > > > > > > > at the "GPU Manager" as an "External Resource Manager", and GPU is one
> > > > > > > > > > > > > > > > > > > > > > > > > > such resource.
> > > > > > > > > > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and ResourceSpec, that is how
> > > > > > > > > > > > > > > > > > > > > > > > > > it is done there.
> > > > > > > > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible. Maybe there is a GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > Resource, a specialized NVIDIA GPU Resource, an FPGA Resource, an Alibaba
> > > > > > > > > > > > > > > > > > > > > > > > > > TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP, Yangze. GPU resource management support is a
> > > > > > > > > > > > > > > > > > > > > > > > > > > must-have for machine learning use cases. Actually it is one of the most
> > > > > > > > > > > > > > > > > > > > > > > > > > > frequently asked questions from users who are interested in using Flink
> > > > > > > > > > > > > > > > > > > > > > > > > > > for ML.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Some quick comments / questions on the wiki.
> > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be mentioned in the public
> > > > > > > > > > > > > > > > > > > > > > > > > > > interface section.
> > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a public API?
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the discussion, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in Flink is
> > > > > > > > > > > > > > > > > > > > > > > > > > > > significant, especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I think it's a
> > > > > > > > > > > > > > > > > > > > > > > > > > > > very good first step for Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs are in a task executor and
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > forward such requirements to the external resource managers (for
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU resources to operators.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services to discover
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > and expose GPU resource information to the context of functions.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which we provide
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > the privilege mode to help users achieve worker-level isolation in
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki document [1]. Looking
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > forward to your feedback.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo

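For illustration, the operator-level step that Becket describes in the quoted
thread above (point 2.b, e.g. GPU 1 for operator A and GPU 2 for operator B)
could be sketched as follows. This is only a minimal sketch and not Flink code:
the class OperatorResourceAssignerSketch and its methods are hypothetical
names, and it assumes that GPUs are identified by integer indexes which some
discovery step has already produced.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical TM-level assigner; the name and API are assumptions for illustration. */
final class OperatorResourceAssignerSketch {
    // GPU indexes that are not bound to any operator yet.
    private final Deque<Integer> freeGpuIndexes = new ArrayDeque<>();
    // Current operator -> GPU index bindings.
    private final Map<String, Integer> assigned = new HashMap<>();

    OperatorResourceAssignerSketch(List<Integer> discoveredGpuIndexes) {
        freeGpuIndexes.addAll(discoveredGpuIndexes);
    }

    /** Binds one free GPU index to the operator, or returns its existing binding. */
    synchronized int assign(String operatorId) {
        Integer existing = assigned.get(operatorId);
        if (existing != null) {
            return existing;
        }
        Integer gpu = freeGpuIndexes.poll();
        if (gpu == null) {
            throw new IllegalStateException("No free GPU left for " + operatorId);
        }
        assigned.put(operatorId, gpu);
        return gpu;
    }

    /** Returns the operator's GPU index to the free pool, if it had one. */
    synchronized void release(String operatorId) {
        Integer gpu = assigned.remove(operatorId);
        if (gpu != null) {
            freeGpuIndexes.addFirst(gpu);
        }
    }
}
```

The point of the sketch is that, unlike memory, the assigner has to track which
concrete device each operator received, which is why a component at the TM
level is needed for this step.
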
Till Rohrmann

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

If there is no need for the ExternalResourceDriver on the RM side, then it
is always a good idea to keep things simple and not introduce it. One can
always change things once one realizes that there is a need for it.

Cheers,
Till
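
As a rough illustration of the configuration-based approach discussed in the
quoted messages below, the RM side could derive the key-value pair for a
pod/container request purely from configuration. This is a minimal sketch
under assumptions: the key names follow the forms written in this thread
(external-resource.{resourceName}.amount and
external-resource.{resourceName}.yarn/kubernetes.key) and may differ from
what is eventually implemented; the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical helper; not part of Flink. */
final class ExternalResourceConfigSketch {

    /**
     * Builds the single entry to set on a request for the given deployment
     * ("yarn" or "kubernetes"). Returns an empty map if the resource is not
     * configured for that deployment.
     */
    static Map<String, Long> toRequestEntry(
            Map<String, String> conf, String resourceName, String deployment) {
        String prefix = "external-resource." + resourceName + ".";
        String key = conf.get(prefix + deployment + ".key");
        String amount = conf.get(prefix + "amount");
        if (key == null || amount == null) {
            return Collections.emptyMap();
        }
        return Collections.singletonMap(key, Long.parseLong(amount));
    }
}
```

With example values external-resource.gpu.amount set to 2 and
external-resource.gpu.yarn.key set to yarn.io/gpu, calling
toRequestEntry(conf, "gpu", "yarn") yields a map with the single entry
yarn.io/gpu -> 2, and no user code has to run in the RM.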

On Mon, Mar 30, 2020 at 12:00 PM Yangze Guo <[hidden email]> wrote:

> Hi @Till, @Xintong
>
> I think even without the credential concerns, replacing the interfaces
> with configuration options is a good idea from my side.
> - Currently, I don't see any external resource that is incompatible
> with this mechanism.
> - It reduces the burden on users of implementing a plugin themselves.
> WDYT?
>
> Best,
> Yangze Guo
>
> On Mon, Mar 30, 2020 at 5:44 PM Xintong Song <[hidden email]> wrote:
> >
> > I also agree that the pluggable ExternalResourceDriver should be loaded by
> > the cluster class loader. Although the plugin might be implemented by users,
> > external resources (as part of task executor resources) should be cluster
> > configurations, unlike job-level user code such as UDFs, because the task
> > executors belong to the cluster rather than to jobs.
> >
> >
> > IIUC, the concern Stephan raised is about the potential credential problem
> > when executing user code on the RM with the cluster class loader. The concern
> > makes sense to me, and I think what Yangze suggested should be a good
> > approach to prevent such credential problems. The only reason we
> > tried to execute user code (i.e. getKubernetes/YarnExternalResource) on the
> > RM was that we need to set these key-value pairs on pod/container requests.
> > By replacing the interfaces getKubernetes/YarnExternalResource with the
> > configuration options
> > 'external-resource.{resourceName}.yarn/kubernetes.key/amount',
> > we can still fulfill that purpose, without the credential risks.
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Mon, Mar 30, 2020 at 5:17 PM Till Rohrmann <[hidden email]> wrote:
> > >
> > > At the moment the RM does not have a user code class loader and I agree
> > > with Stephan that it should stay like this. This, however, does not mean
> > > that we cannot support pluggable components in the RM. As long as the
> > > plugins are on the system's class path, it should be fine for the RM to
> > > load them. For example, we could add external resources via Flink's plugin
> > > mechanism or something similar.
> > >
> > > A very simple implementation of such an ExternalResourceDriver could be a
> > > class which simply returns what is written in the flink-conf.yaml under a
> > > given key.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Mar 30, 2020 at 5:39 AM Yangze Guo <[hidden email]> wrote:
> > >
> > > > Hi, Stephan,
> > > >
> > > > I see your concern and I totally agree with you.
> > > >
> > > > The interface on the RM side is now `Map<String key, String/Long value>
> > > > getYarn/KubernetesExternalResource()`. The only valid information the RM
> > > > gets from it is the configuration key of that external resource in
> > > > Yarn/K8s. The "String/Long value" would be the same as the
> > > > external-resource.{resourceName}.amount.
> > > > So, I think it makes sense to replace these two interfaces with two
> > > > configs, i.e. external-resource.{resourceName}.yarn/kubernetes.key. We
> > > > may lose some extensibility, but AFAIK it could work with common
> > > > external resources like GPU, FPGA. WDYT?
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Fri, Mar 27, 2020 at 7:59 PM Stephan Ewen <[hidden email]> wrote:
> > > > >
> > > > > Maybe one final comment: It is probably not an issue, but let's try and
> > > > > keep user code (via user code classloader) out of the ResourceManager, if
> > > > > possible.
> > > > >
> > > > > As background:
> > > > >
> > > > > There were thoughts in the past to support setups where the RM must run
> > > > > with "superuser" credentials, but we cannot run JM/TM with these
> > > > > credentials, as the user code might access them otherwise.
> > > > > This is actually possible today, you can run the RM in a different JVM or
> > > > > in a different container, and give it more credentials than JMs / TMs. But
> > > > > for this to be feasible, we cannot allow any user-defined code to be in the
> > > > > JVM, because that instantaneously breaks the isolation of credentials.
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:
> > > > > >
> > > > > > Thanks for the feedback, @Till and @Xintong.
> > > > > >
> > > > > > Regarding separating the interface, I'm also +1 for it.
> > > > > >
> > > > > > Regarding the resource allocation interface, true, it's dangerous to
> > > > > > give too much access to user code. Changing the return type to Map<String
> > > > > > key, String/Long value> makes sense to me. AFAIK, it is compatible
> > > > > > with all the first-party supported resources for Yarn/Kubernetes. It
> > > > > > could also free us from the potential dependency issue.
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > > > On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]> wrote:
> > > > > > >
> > > > > > > Thanks for updating the FLIP, Yangze.
> > > > > > >
> > > > > > > I agree with Till that we probably want to separate the K8s/Yarn decorator
> > > > > > > calls. Users can still configure one driver class, and we can use
> > > > > > > `instanceof` to check whether the driver implements the K8s/Yarn specific
> > > > > > > interfaces.
> > > > > > >
> > > > > > > Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
> > > > > > > (`AbstractKubernetesStepDecorator` directly manipulates the `Pod`) to user
> > > > > > > code. It gives more access to user code than is needed for defining external
> > > > > > > resources, which might cause problems. Instead, I would suggest having an
> > > > > > > interface like `Map<String key, String value>
> > > > > > > getYarn/KubernetesExternalResource()` and assembling the entries into
> > > > > > > `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
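
The interface segregation discussed in this part of the thread (a base driver
without deployment-specific calls, separate Yarn/Kubernetes interfaces, and an
`instanceof` check on the RM side) might look roughly like the sketch below.
It is an assumption-laden illustration rather than the FLIP's actual API: only
the method names getYarnExternalResource / getKubernetesExternalResource come
from the thread, while the interface names, GpuDriverSketch,
DriverDispatchSketch, and the example value "nvidia.com/gpu" are hypothetical.

```java
import java.util.Collections;
import java.util.Map;

/** Base interface: deliberately free of any Yarn or Kubernetes types. */
interface ExternalResourceDriver {
    String resourceName();
}

/** Implemented only by drivers that support Yarn deployments. */
interface YarnExternalResourceDriver extends ExternalResourceDriver {
    Map<String, String> getYarnExternalResource();
}

/** Implemented only by drivers that support Kubernetes deployments. */
interface KubernetesExternalResourceDriver extends ExternalResourceDriver {
    Map<String, String> getKubernetesExternalResource();
}

/** Example driver that supports Kubernetes but not Yarn. */
final class GpuDriverSketch implements KubernetesExternalResourceDriver {
    @Override
    public String resourceName() {
        return "gpu";
    }

    @Override
    public Map<String, String> getKubernetesExternalResource() {
        return Collections.singletonMap("nvidia.com/gpu", "1");
    }
}

/** RM-side dispatch: only plain key-value pairs ever leave the driver. */
final class DriverDispatchSketch {
    static Map<String, String> forYarn(ExternalResourceDriver driver) {
        if (driver instanceof YarnExternalResourceDriver) {
            return ((YarnExternalResourceDriver) driver).getYarnExternalResource();
        }
        return Collections.<String, String>emptyMap();
    }

    static Map<String, String> forKubernetes(ExternalResourceDriver driver) {
        if (driver instanceof KubernetesExternalResourceDriver) {
            return ((KubernetesExternalResourceDriver) driver).getKubernetesExternalResource();
        }
        return Collections.<String, String>emptyMap();
    }
}
```

Because the driver returns plain maps, the base interface needs no Hadoop or
Kubernetes classes on the classpath, and a driver expresses "not supported on
this deployment" simply by not implementing the corresponding interface.
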
> > > > > > > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > I'm a bit late to the party. I think the current proposal looks good.
> > > > > > > >
> > > > > > > > Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> > > > > > > > would suggest to not include the decorator calls for Kubernetes and Yarn in
> > > > > > > > the base interface. Instead I would suggest to segregate the deployment
> > > > > > > > specific decorator calls into separate interfaces. That way an
> > > > > > > > ExternalResourceDriver does not have to support all deployments from the
> > > > > > > > very beginning. Moreover, some resources might not be supported by a
> > > > > > > > specific deployment target and the natural way to express this would be to
> > > > > > > > not implement the respective deployment specific interface.
> > > > > > > >
> > > > > > > > Moreover, having void
> > > > > > > > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > > > > > > > in the ExternalResourceDriver interface would require Hadoop on Flink's
> > > > > > > > classpath whenever the external resource driver is being used.
> > > > > > > >
> > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Till
> > > > > > > >
> > > > > > > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > Nice, thanks a lot!
> > > > > > > > >
> > > > > > > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > > > > > > > >
> > > > > > > > > > I've updated the FLIP accordingly. I did not add a
> > > > > > > > > > ResourceInfoProvider. Instead, I introduced the ExternalResourceDriver,
> > > > > > > > > > which takes responsibility for all relevant operations on both the RM
> > > > > > > > > > and TM sides.
> > > > > > > > > > After a rethink about decoupling the management of external resources
> > > > > > > > > > from TaskExecutor, I think we could do the same thing on the
> > > > > > > > > > ResourceManager side. We do not need to add specific allocation
> > > > > > > > > > logic to the ResourceManager each time we add a specific external
> > > > > > > > > > resource.
> > > > > > > > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > > > > > > > containerRequest.
> > > > > > > > > > - For Kubernetes, ExternalResourceDriver could provide a decorator for
> > > > > > > > > > the TM pod.
> > > > > > > > > >
> > > > > > > > > > In this way, just like MetricReporter, we allow users to define their
> > > > > > > > > > custom ExternalResourceDriver. It is more extensible and fits the
> > > > > > > > > > separation of concerns. For more details, please take a look at [1].
> > > > > > > > > >
> > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > This sounds good to go ahead from my side.
> > > > > > > > > > >
> > > > > > > > > > > I like the approach that Becket suggested - in that case the core
> > > > > > > > > > > abstraction that everyone would need to understand would be "external
> > > > > > > > > > > resource allocation" and the "ResourceInfoProvider", and the GPU specific
> > > > > > > > > > > code would be a specific implementation only known to that component that
> > > > > > > > > > > allocates the external resource. That fits the separation of concerns well.
> > > > > > > > > > >
> > > > > > > > > > > I also understand that it should not be over-engineered in the first
> > > > > > > > > > > version, so some simplification makes sense, and then gradually expand from
> > > > > > > > > > > there.
> > > > > > > > > > >
> > > > > > > > > > > So +1 to go ahead with what was suggested above (Xintong / Becket) from my
> > > > > > > > > > > side.
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > > > > > > >
> > > > > > > > > > > > @Stephan
> > > > > > > > > > > >
> > > > > > > > > > > > I see your concern, and I completely agree with you that we should first
> > > > > > > > > > > > think about the "library" / "plugin" / "extension" style if possible.
> > > > > > > > > > > >
> > > > > > > > > > > > > If GPUs are sliced and assigned during scheduling, there may be reason,
> > > > > > > > > > > > > although it looks that it would belong to the slot then. Is that what we
> > > > > > > > > > > > > are doing here?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > In the current proposal, we do not have the GPUs sliced and assigned to
> > > > > > > > > > > > slots, because it could be problematic without dynamic slot allocation.
> > > > > > > > > > > > E.g., the number of GPUs might not be evenly divisible by the number of
> > > > > > > > > > > > slots.
> > > > > > > > > > > >
> > > > > > > > > > > > I think it makes sense to eventually have the GPUs assigned to slots. Even
> > > > > > > > > > > > then, we might still need a TM level GPUManager (or ResourceProvider like
> > > > > > > > > > > > Becket suggested). For memory, in each slot we can simply request the
> > > > > > > > > > > > amount of memory, leaving it to JVM / OS to decide which memory (address)
> > > > > > > > > > > > should be assigned. For GPU, and potentially other resources like FPGA, we
> > > > > > > > > > > > need to explicitly specify which GPU (index) should be used. Therefore, we
> > > > > > > > > > > > need some component at the TM level to coordinate which slot uses which
> > > > > > > > > > > > GPU.
> > > > > > > > > > > > IMO, unless we say Flink will not support slot-level
> GPU
> > > > > > slicing at
> > > > > > > > > > least
> > > > > > > > > > > > in the foreseeable future, I don't see a good way to
> > > avoid
> > > > > > touching
> > > > > > > > > > the TM
> > > > > > > > > > > > core. To that end, I think Becket's suggestion
> points to
> > > a
> > > > good
> > > > > > > > > > direction,
> > > > > > > > > > > > that supports more features (GPU, FPGA, etc.) with
> less
> > > > > > coupling to
> > > > > > > > > > the TM
> > > > > > > > > > > > core (only needs to understand the general
> interfaces).
> > > The
> > > > > > > > detailed
> > > > > > > > > > > > implementation for specific resource types can even
> be
> > > > > > encapsulated
> > > > > > > > > as
> > > > > > > > > > a
> > > > > > > > > > > > library.
> > > > > > > > > > > >
> > > > > > > > > > > > @Becket
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for sharing your thoughts on the final state. Leaving aside the
> > > > > > > > > > > > > details of how the interfaces should look, I think this is a really good
> > > > > > > > > > > > > abstraction for supporting general resource types.
> > > > > > > > > > > >
> > > > > > > > > > > > I'd like to further clarify that, the following three
> > > > things
> > > > > > are
> > > > > > > > all
> > > > > > > > > > that
> > > > > > > > > > > > the "Flink core" needs to understand.
> > > > > > > > > > > >
> > > > > > > > > > > > > - The *amount* of resource, for scheduling. Actually, we already have
> > > > > > > > > > > > >   the Resource class in ResourceProfile and ResourceSpec for extended
> > > > > > > > > > > > >   resources. It's just not really used.
> > > > > > > > > > > > > - The *info* that Flink provides to the operators / user code.
> > > > > > > > > > > > > - The *provider*, which generates the info based on the amount.
> > > > > > > > > > > >
> > > > > > > > > > > > The "core" does not need to understand the specific
> > > > > > implementation
> > > > > > > > > > details
> > > > > > > > > > > > of the above three. They can even be implemented in a
> > > > 3rd-party
> > > > > > > > > > library.
> > > > > > > > > > > > Similar to how we allow users to define their custom
> > > > > > > > MetricReporter.
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you~
> > > > > > > > > > > >
> > > > > > > > > > > > Xintong Song
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <
> > > > > > [hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the comment, Stephan.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - If everything becomes a "core feature", it will
> > > make
> > > > the
> > > > > > > > > project
> > > > > > > > > > hard
> > > > > > > > > > > > > > to develop in the future. Thinking "library" /
> > > > "plugin" /
> > > > > > > > > > "extension"
> > > > > > > > > > > > > style
> > > > > > > > > > > > > > where possible helps.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Completely agree. It is much more important to
> design a
> > > > > > mechanism
> > > > > > > > > > than
> > > > > > > > > > > > > focusing on a specific case. Here is what I am
> thinking
> > > > to
> > > > > > fully
> > > > > > > > > > support
> > > > > > > > > > > > > custom resource management:
> > > > > > > > > > > > > 1. On the JM / RM side, use ResourceProfile and
> > > > ResourceSpec
> > > > > > to
> > > > > > > > > > define
> > > > > > > > > > > > the
> > > > > > > > > > > > > resource and the amount required. They will be
> used to
> > > > find
> > > > > > > > > suitable
> > > > > > > > > > TMs
> > > > > > > > > > > > > slots to run the tasks. At this point, the
> resources
> > > are
> > > > only
> > > > > > > > > > measured by
> > > > > > > > > > > > > amount, i.e. they do not have individual ID.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. On the TM side, have something like a *"ResourceInfoProvider"* to
> > > > > > > > > > > > > > identify the individual resources and provide their details, e.g. the
> > > > > > > > > > > > > > GPU ID. This is important because the operator may have to explicitly
> > > > > > > > > > > > > > interact with the physical resource it uses. The ResourceInfoProvider
> > > > > > > > > > > > > > might look something like the following.
> > > > > > > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > > > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(
> > > > > > > > > > > > > >             OperatorId opId, ResourceProfile resourceProfile);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > - There could be several "*ResourceInfoProvider*"
> > > > configured
> > > > > > on
> > > > > > > > the
> > > > > > > > > > TM to
> > > > > > > > > > > > > retrieve the information for different resources.
> > > > > > > > > > > > > > - The TM will be responsible for assigning those
> individual
> > > > > > resources
> > > > > > > > > to
> > > > > > > > > > each
> > > > > > > > > > > > > operator according to their requested amount.
> > > > > > > > > > > > > - The operators will be able to get the
> ResourceInfo
> > > from
> > > > > > their
> > > > > > > > > > > > > RuntimeContext.
> > > > > > > > > > > > >
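For illustration only (not part of the original messages): a minimal, self-contained sketch of how a GPU-specific provider could plug into the proposed ResourceInfoProvider idea. Every name ending in `Sketch` is hypothetical. The proposed `AbstractID`, `OperatorId` and `ResourceProfile` parameters are replaced with a plain `String` and an `int` amount so the example compiles without Flink on the classpath. The first-come, first-served index assignment is an assumption, not something the thread specifies.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the proposed interface. The proposal keys the result
// by AbstractID and takes an OperatorId plus a ResourceProfile; plain types are
// used here so the sketch is self-contained.
interface ResourceInfoProviderSketch<INFO> {
    Map<String, INFO> retrieveResourceInfo(String operatorId, int amount);
}

// Hypothetical info object handed to the operator: only the device index.
final class GpuInfoSketch {
    final int index;

    GpuInfoSketch(int index) {
        this.index = index;
    }
}

// Hands out GPU indexes from a fixed pool, so each index goes to at most one operator.
final class GpuInfoProviderSketch implements ResourceInfoProviderSketch<GpuInfoSketch> {
    private final List<Integer> free = new ArrayList<>();

    GpuInfoProviderSketch(int gpusOnTaskManager) {
        for (int i = 0; i < gpusOnTaskManager; i++) {
            free.add(i);
        }
    }

    @Override
    public synchronized Map<String, GpuInfoSketch> retrieveResourceInfo(String operatorId, int amount) {
        if (amount > free.size()) {
            throw new IllegalStateException(
                    "Requested " + amount + " GPUs, only " + free.size() + " free");
        }
        Map<String, GpuInfoSketch> assigned = new HashMap<>();
        for (int i = 0; i < amount; i++) {
            int index = free.remove(0); // take the lowest free index
            assigned.put(operatorId + "-gpu-" + index, new GpuInfoSketch(index));
        }
        return assigned;
    }
}
```

In this sketch the core only sees the generic interface and an amount, while the notion of a "GPU index" stays inside the provider, which is the separation of concerns discussed above.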
> > > > > > > > > > > > > > If we agree this is a reasonable final state, we can adapt the current
> > > > > > > > > > > > > > FLIP to it. In fact, it does not sound like a big change to me. All the
> > > > > > > > > > > > > > proposed configuration can stay as is; it is just that Flink itself
> > > > > > > > > > > > > > won't care about it. Instead, a GPUInfoProvider implementing the
> > > > > > > > > > > > > > ResourceInfoProvider will use it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <
> > > > > > [hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi all!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The main point I wanted to throw into the
> discussion
> > > > is the
> > > > > > > > > > following:
> > > > > > > > > > > > > > - With more and more use cases, more and more
> tools
> > > > go
> > > > > > into
> > > > > > > > > Flink
> > > > > > > > > > > > > > - If everything becomes a "core feature", it
> will
> > > > make
> > > > > > the
> > > > > > > > > > project
> > > > > > > > > > > > hard
> > > > > > > > > > > > > > to develop in the future. Thinking "library" /
> > > > "plugin" /
> > > > > > > > > > "extension"
> > > > > > > > > > > > > style
> > > > > > > > > > > > > > where possible helps.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - A good thought experiment is always: How many
> > > > future
> > > > > > > > > developers
> > > > > > > > > > > > have
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > interact with this code (and possibly understand
> it
> > > > > > partially),
> > > > > > > > > > even if
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > features they touch have nothing to do with GPU
> > > > support. If
> > > > > > > > many
> > > > > > > > > > > > > > contributors to unrelated features will have to
> touch
> > > > it
> > > > > > and
> > > > > > > > > > understand
> > > > > > > > > > > > > it,
> > > > > > > > > > > > > > then let's think if there is a different
> solution.
> > > > Maybe
> > > > > > there
> > > > > > > > is
> > > > > > > > > > not,
> > > > > > > > > > > > > but
> > > > > > > > > > > > > > then we should be sure why.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - That led me to raising this issue: If the GPU
> > > > manager
> > > > > > > > > becomes a
> > > > > > > > > > > > core
> > > > > > > > > > > > > > service in the TaskManager, Environment,
> > > > RuntimeContext,
> > > > > > etc.
> > > > > > > > > then
> > > > > > > > > > > > > everyone
> > > > > > > > > > > > > > > developing TM and streaming tasks needs to
> understand
> > > > the
> > > > > > GPU
> > > > > > > > > > manager.
> > > > > > > > > > > > > That
> > > > > > > > > > > > > > seems oddly specific, is my impression.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Access to configuration seems not the right
> reason to
> > > > do
> > > > > > that.
> > > > > > > > We
> > > > > > > > > > > > should
> > > > > > > > > > > > > > expose the Flink configuration from the
> > > RuntimeContext
> > > > > > anyways.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If GPUs are sliced and assigned during
> scheduling,
> > > > there
> > > > > > may be
> > > > > > > > > > reason,
> > > > > > > > > > > > > > although it looks that it would belong to the
> slot
> > > > then. Is
> > > > > > > > that
> > > > > > > > > > what
> > > > > > > > > > > > we
> > > > > > > > > > > > > > are doing here?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <
> > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > IMO, eventually an operator should only see
> info of
> > > > GPUs
> > > > > > that
> > > > > > > > > are
> > > > > > > > > > > > > > dedicated
> > > > > > > > > > > > > > > for it, instead of all GPUs on the
> > > machine/container
> > > > in
> > > > > > the
> > > > > > > > > > current
> > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > It does not make sense to make the user who writes a UDF worry about
> > > > > > > > > > > > > > > > coordination among multiple operators running on the same machine. And if
> > > > > > > > > > > > > > > > we want to limit the GPU info an operator sees, we should not let the
> > > > > > > > > > > > > > > > operator instantiate GPUManager, which means we have to expose something
> > > > > > > > > > > > > > > > through the runtime context, either GPU info or some kind of limited
> > > > > > > > > > > > > > > > access to the GPUManager.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <
> > > > > > > > > [hidden email]
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It probably makes sense for us to first agree
> on
> > > the
> > > > > > final
> > > > > > > > > > state.
> > > > > > > > > > > > More
> > > > > > > > > > > > > > > > specifically, will the resource info be
> exposed
> > > > through
> > > > > > > > > runtime
> > > > > > > > > > > > > context
> > > > > > > > > > > > > > > > eventually?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > If that is the final state and we have a
> seamless
> > > > > > migration
> > > > > > > > > > story
> > > > > > > > > > > > > from
> > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > FLIP to that final state, Personally I think
> it
> > > is
> > > > OK
> > > > > > to
> > > > > > > > > > expose the
> > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > info in the runtime context.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong
> Song <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > > > > > > I think what Stephan means (@Stephan,
> please
> > > > correct
> > > > > > me
> > > > > > > > if
> > > > > > > > > > I'm
> > > > > > > > > > > > > wrong)
> > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > that, we might not need to hold and
> maintain
> > > the
> > > > > > > > GPUManager
> > > > > > > > > > as a
> > > > > > > > > > > > > > > service
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > TaskManagerServices or RuntimeContext. An
> > > > > > alternative is
> > > > > > > > to
> > > > > > > > > > > > create
> > > > > > > > > > > > > /
> > > > > > > > > > > > > > > > > retrieve the GPUManager only in the
> operators
> > > > that
> > > > > > need
> > > > > > > > it,
> > > > > > > > > > e.g.,
> > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > > > > > > I agree with you on excluding GPUManager
> from
> > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - For the first step, where we provide
> > > unified
> > > > > > > > TM-level
> > > > > > > > > > GPU
> > > > > > > > > > > > > > > > information
> > > > > > > > > > > > > > > > > to all operators, it should be fine to
> have
> > > > > > operators
> > > > > > > > > > access /
> > > > > > > > > > > > > > > > > lazy-initiate GPUManager by themselves.
> > > > > > > > > > > > > > > > > - In future, we might have some more
> > > > fine-grained
> > > > > > GPU
> > > > > > > > > > > > > management,
> > > > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > > we need to maintain GPUManager as a
> service
> > > > and
> > > > > > put
> > > > > > > > GPU
> > > > > > > > > > info
> > > > > > > > > > > > in
> > > > > > > > > > > > > > slot
> > > > > > > > > > > > > > > > > profiles. But at least for now it's not
> > > > necessary
> > > > > > to
> > > > > > > > > > introduce
> > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > complexity.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > However, I have some concerns on excluding
> > > > GPUManager
> > > > > > > > from
> > > > > > > > > > > > > > > RuntimeContext
> > > > > > > > > > > > > > > > > and let operators access it directly.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - Configurations needed for creating the
> > > > > > > GPUManager are
> > > > > > > > > not
> > > > > > > > > > > > > always
> > > > > > > > > > > > > > > > > available for operators.
> > > > > > > > > > > > > > > > > - If later we want to have fine-grained
> > > > control
> > > > > > over
> > > > > > > > GPU
> > > > > > > > > > > > (e.g.,
> > > > > > > > > > > > > > > > > operators in each slot can only see GPUs
> > > > reserved
> > > > > > for
> > > > > > > > > that
> > > > > > > > > > > > > slot),
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I would suggest wrapping the GPUManager
> behind
> > > > > > > > > RuntimeContext
> > > > > > > > > > and
> > > > > > > > > > > > > only
> > > > > > > > > > > > > > > > > expose the GPUInfo to users. For now, we
> can
> > > > declare
> > > > > > a
> > > > > > > > > method
> > > > > > > > > > > > > > > > > `getGPUInfo()` in RuntimeContext, with a
> > > default
> > > > > > > > definition
> > > > > > > > > > that
> > > > > > > > > > > > > > calls
> > > > > > > > > > > > > > > > > `GPUManager.get()` to get the
> lazily-created
> > > > > > GPUManager.
> > > > > > > > If
> > > > > > > > > > later
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > > > to create / retrieve GPUManager in a
> different
> > > > way,
> > > > > > we
> > > > > > > > can
> > > > > > > > > > simply
> > > > > > > > > > > > > > > change
> > > > > > > > > > > > > > > > > how `getGPUInfo` is implemented, without
> > > needing
> > > > to
> > > > > > > > change
> > > > > > > > > > any
> > > > > > > > > > > > > public
> > > > > > > > > > > > > > > > > interfaces.
> > > > > > > > > > > > > > > > >
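For illustration only (not part of the original messages): a minimal sketch of the pattern described above, where user code sees only `getGPUInfo()` and the lazily created manager stays hidden behind it. `GpuManagerSketch` and `RuntimeContextSketch` are hypothetical stand-ins, not Flink classes. The hard-coded index list is an assumption standing in for whatever the discovery script would return.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical lazily created, process-wide manager in the spirit of `GPUManager.get()`.
final class GpuManagerSketch {
    private static GpuManagerSketch instance;
    static int creations = 0; // exists only to make the lazy creation observable

    private final List<Integer> gpuIndexes;

    private GpuManagerSketch(List<Integer> gpuIndexes) {
        this.gpuIndexes = Collections.unmodifiableList(gpuIndexes);
    }

    static synchronized GpuManagerSketch get() {
        if (instance == null) {
            // Assumption: a real manager would run the discovery script here.
            instance = new GpuManagerSketch(Arrays.asList(0, 1));
            creations++;
        }
        return instance;
    }

    List<Integer> gpuIndexes() {
        return gpuIndexes;
    }
}

// Hypothetical slice of RuntimeContext: users get the info, never the manager itself.
interface RuntimeContextSketch {
    default List<Integer> getGPUInfo() {
        return GpuManagerSketch.get().gpuIndexes();
    }
}
```

If the manager later becomes a TM-level service, only the body of the default method would need to change; callers of `getGPUInfo()` stay untouched.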
> > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze
> Guo <
> > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > > > > > Do you mean Minicluster? Yes, it makes
> sense
> > > to
> > > > > > share
> > > > > > > > the
> > > > > > > > > > GPU
> > > > > > > > > > > > > > Manager
> > > > > > > > > > > > > > > > > > in such scenario.
> > > > > > > > > > > > > > > > > > If that's what you worry about, I'm +1
> for
> > > > holding
> > > > > > > > > > > > > > > > > > GPUManager(ExternalResourceManagers) in
> > > > > > TaskExecutor
> > > > > > > > > > instead of
> > > > > > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Regarding the
> RuntimeContext/FunctionContext,
> > > > it
> > > > > > just
> > > > > > > > > > holds the
> > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > info instead of the GPU Manager. AFAIK,
> it's
> > > > the
> > > > > > only
> > > > > > > > > > place we
> > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > pass GPU info to the
> > > > > > RichFunction/UserDefinedFunction.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac
> > > Godfried
> > > > <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000
> > > > > > > > > [hidden email]
> > > > > > > > > > > > wrote
> > > > > > > > > > > > > > > ----
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Can we somehow keep this out of the
> > > > > > TaskManager
> > > > > > > > > > services
> > > > > > > > > > > > > > > > > > > > I fear that we could not. IMO, the
> > > > > > GPUManager(or
> > > > > > > > > > > > > > > > > > > > ExternalServicesManagers in future)
> is
> > > > > > conceptually
> > > > > > > > > > one of
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > task
> > > > > > > > > > > > > > > > > > > > manager services, just like
> MemoryManager
> > > > > > before
> > > > > > > > > 1.10.
> > > > > > > > > > > > > > > > > > > > - It maintains/holds the GPU
> resource at
> > > TM
> > > > > > level
> > > > > > > > and
> > > > > > > > > > all
> > > > > > > > > > > > of
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > operators allocate the GPU resources
> from
> > > > it.
> > > > > > So,
> > > > > > > > it
> > > > > > > > > > should
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > exclusive to a single TaskExecutor.
> > > > > > > > > > > > > > > > > > > > - We could add a collection called
> > > > > > > > > > ExternalResourceManagers
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > hold
> > > > > > > > > > > > > > > > > > > > all managers of other external
> resources
> > > > in the
> > > > > > > > > future.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Can you help me understand why this
> needs
> > > the
> > > > > > > > addition
> > > > > > > > > in
> > > > > > > > > > > > > > > > > > > TaskManagerServices
> > > > > > > > > > > > > > > > > > > or in the RuntimeContext?
> > > > > > > > > > > > > > > > > > > Are you worried about the case when
> > > multiple
> > > > Task
> > > > > > > > > > Executors
> > > > > > > > > > > > run
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > > > > > JVM? That's not common, but wouldn't it
> > > > actually
> > > > > > be
> > > > > > > > > good
> > > > > > > > > > in
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > case
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > share the GPU Manager, given that the
> GPU
> > > is
> > > > > > shared?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > What parts need information about
> this?
> > > > > > > > > > > > > > > > > > > > In this FLIP, operators need the
> > > > information.
> > > > > > Thus,
> > > > > > > > > we
> > > > > > > > > > > > expose
> > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > information to the
> > > > > > RuntimeContext/FunctionContext.
> > > > > > > > > The
> > > > > > > > > > slot
> > > > > > > > > > > > > > > profile
> > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > not aware of GPU resources as GPU is
> TM
> > > > level
> > > > > > > > > resource
> > > > > > > > > > now.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Can the GPU Manager be a "self
> > > contained"
> > > > > > thing
> > > > > > > > > that
> > > > > > > > > > > > simply
> > > > > > > > > > > > > > > takes
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > configuration, and then abstracts
> > > > everything
> > > > > > > > > > internally?
> > > > > > > > > > > > > > > > > > > > Yes, we just pass the path/args of
> the
> > > > discover
> > > > > > > > > script
> > > > > > > > > > and
> > > > > > > > > > > > > how
> > > > > > > > > > > > > > > many
> > > > > > > > > > > > > > > > > > > > GPUs per TM to it. It takes the
> > > > responsibility
> > > > > > to
> > > > > > > > get
> > > > > > > > > > the
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > information and expose them to the
> > > > > > > > > > > > > > RuntimeContext/FunctionContext
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > Operators. Meanwhile, we'd better not
> > > allow
> > > > > > > > operators
> > > > > > > > > > to
> > > > > > > > > > > > > > directly
> > > > > > > > > > > > > > > > > > > > > access GPUManager; they should get what
> > > they
> > > > want
> > > > > > > > from
> > > > > > > > > > > > Context.
> > > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > > then decouple the
> > > interface/implementation
> > > > of
> > > > > > > > > > GPUManager
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > Public
> > > > > > > > > > > > > > > > > > > > API.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM
> Stephan
> > > > Ewen <
> > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > It sounds fine to initially start
> with
> > > > GPU
> > > > > > > > specific
> > > > > > > > > > > > support
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > > > > generalizing this once we better
> > > > understand
> > > > > > the
> > > > > > > > > > space.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > About the implementation suggested
> in
> > > > > > FLIP-108:
> > > > > > > > > > > > > > > > > > > > > - Can we somehow keep this out of
> the
> > > > > > TaskManager
> > > > > > > > > > > > services?
> > > > > > > > > > > > > > > > > Anything
> > > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > > have to pull through all layers of
> the
> > > TM
> > > > > > makes
> > > > > > > > the
> > > > > > > > > > TM
> > > > > > > > > > > > > > > components
> > > > > > > > > > > > > > > > > yet
> > > > > > > > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > > > > complex and harder to maintain.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > - What parts need information about
> > > this?
> > > > > > > > > > > > > > > > > > > > > -> do the slot profiles need
> > > information
> > > > > > about
> > > > > > > > the
> > > > > > > > > > GPU?
> > > > > > > > > > > > > > > > > > > > > -> Can the GPU Manager be a "self
> > > > contained"
> > > > > > > > thing
> > > > > > > > > > that
> > > > > > > > > > > > > > simply
> > > > > > > > > > > > > > > > > takes
> > > > > > > > > > > > > > > > > > > > > the configuration, and then
> abstracts
> > > > > > everything
> > > > > > > > > > > > > internally?
> > > > > > > > > > > > > > > > > > Operators
> > > > > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > > access it via "GPUManager.get()"
> or so?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM
> Yangze
> > > > Guo <
> > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo,
> > > you're
> > > > > > right,
> > > > > > > > > > I'll add
> > > > > > > > > > > > > > them
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > > > > > Regarding the general extended
> > > resource
> > > > > > > > > mechanism,
> > > > > > > > > > I
> > > > > > > > > > > > > second
> > > > > > > > > > > > > > > > > > Xintong's
> > > > > > > > > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > > > > > > > > - It's better to leverage
> > > > ResourceProfile
> > > > > > and
> > > > > > > > > > > > > ResourceSpec
> > > > > > > > > > > > > > > > after
> > > > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > > > > support fine-grained GPU
> > > > scheduling. As
> > > > > > a
> > > > > > > > > first
> > > > > > > > > > step
> > > > > > > > > > > > > > > > > proposal, I
> > > > > > > > > > > > > > > > > > > > > > prefer to not include it in the
> scope
> > > > of
> > > > > > this
> > > > > > > > > FLIP.
> > > > > > > > > > > > > > > > > > > > > > - Regarding the "Extended
> Resource
> > > > > > Manager",
> > > > > > > > if I
> > > > > > > > > > > > > > understand
> > > > > > > > > > > > > > > > > > > > > > > correctly, it is just a code
> refactoring
> > > > atm,
> > > > > > we
> > > > > > > > > could
> > > > > > > > > > > > > extract
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > >
> open/close/allocateExtendResources of
> > > > > > > > GPUManager
> > > > > > > > > to
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > interface. If
> > > > > > > > > > > > > > > > > > > > > > that is the case, +1 to do it
> during
> > > > > > > > > > implementation.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > > > > > As Xintong said, we looked into
> how
> > > > Spark
> > > > > > > > > supports
> > > > > > > > > > a
> > > > > > > > > > > > > > general
> > > > > > > > > > > > > > > > > > "Custom
> > > > > > > > > > > > > > > > > > > > > > Resource Scheduling" before and
> > > > decided to
> > > > > > > > > > introduce a
> > > > > > > > > > > > > > common
> > > > > > > > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > > > > > > > > configuration schema
> > > > > > > > > > > > > > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > > > > > > > > to make it more extensible. I
> think
> > > the
> > > > > > > > > "resource"
> > > > > > > > > > is a
> > > > > > > > > > > > > > > proper
> > > > > > > > > > > > > > > > > > level
> > > > > > > > > > > > > > > > > > > > > > to contain all the configs of
> > > extended
> > > > > > > > resources.
> > > > > > > > > > > > > > > > > > > > > >
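For illustration only (not part of the original messages): a minimal sketch of how the `taskmanager.resource.{resourceName}.amount` part of the schema above could be read generically, without the reader knowing about GPUs specifically. The class `ExternalResourceConfigSketch` and its method are hypothetical. A plain `Map<String, String>` stands in for Flink's configuration object.

```java
import java.util.HashMap;
import java.util.Map;

final class ExternalResourceConfigSketch {
    private static final String PREFIX = "taskmanager.resource.";
    private static final String AMOUNT_SUFFIX = ".amount";

    private ExternalResourceConfigSketch() {}

    // Returns resource name -> configured amount for every key that matches
    // taskmanager.resource.{resourceName}.amount; all other keys are ignored.
    static Map<String, Long> amounts(Map<String, String> conf) {
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            String key = e.getKey();
            boolean matches = key.startsWith(PREFIX)
                    && key.endsWith(AMOUNT_SUFFIX)
                    && key.length() > PREFIX.length() + AMOUNT_SUFFIX.length();
            if (matches) {
                String name = key.substring(PREFIX.length(), key.length() - AMOUNT_SUFFIX.length());
                result.put(name, Long.parseLong(e.getValue().trim()));
            }
        }
        return result;
    }
}
```

Because the resource name is just a segment of the key, adding another resource type such as FPGA would need no change to this reader, which is the extensibility argument made above.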
> > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource management support will greatly
> > > > > > > > > > > > > > > > > > > > > > > > facilitate the development of AI-related applications by PyFlink users.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I think it is better
> > > > > > > > > > > > > > > > > > > > > > > > to delete the "resource" field, which makes them consistent with the
> > > > > > > > > > > > > > > > > > > > > > > > names of other resource-related configurations in TaskManagerOption.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]> wrote on Wed, Mar 4, 2020 at 10:39 AM:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline discussion about making
> > > > > > > > > > > > > > > > > > > > > > > > > the "GPU Support" a more general "Extended Resource Support". We believe
> > > > > > > > > > > > > > > > > > > > > > > > > supporting extended resources through a general mechanism is definitely
> > > > > > > > > > > > > > > > > > > > > > > > > a good and extensible approach. The reason we propose this FLIP with its
> > > > > > > > > > > > > > > > > > > > > > > > > scope narrowed down to GPU alone is mainly the concern about the extra
> > > > > > > > > > > > > > > > > > > > > > > > > effort and review capacity needed for a general mechanism.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > To come up with a good design for a general extended resource management
> > > > > > > > > > > > > > > > > > > > > > > > > mechanism, we would need to investigate more on how people use different
> > > > > > > > > > > > > > > > > > > > > > > > > kinds of resources in practice. For GPU, we learnt such knowledge from
> > > > > > > > > > > > > > > > > > > > > > > > > the experts, Becket and his team members. But for FPGA, or other
> > > > > > > > > > > > > > > > > > > > > > > > > potential extended resources, we don't have such convenient information
> > > > > > > > > > > > > > > > > > > > > > > > > sources, so the investigation would require more effort, which I tend to
> > > > > > > > > > > > > > > > > > > > > > > > > think is not necessary atm.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark supports a general
> > > > > > > > > > > > > > > > > > > > > > > > > "Custom Resource Scheduling". Assuming we want to have a similar general
> > > > > > > > > > > > > > > > > > > > > > > > > extended resource mechanism in the future, we believe that the current
> > > > > > > > > > > > > > > > > > > > > > > > > GPU support design can be easily extended in an incremental way, without
> > > > > > > > > > > > > > > > > > > > > > > > > too much rework.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > - The most important part is probably the user interfaces. Spark offers
> > > > > > > > > > > > > > > > > > > > > > > > > configuration options to define the amount, discovery script and vendor
> > > > > > > > > > > > > > > > > > > > > > > > > (on k8s) on a per-resource-type basis [1], which is very similar to what
> > > > > > > > > > > > > > > > > > > > > > > > > we proposed in this FLIP. I think it's not necessary to expose config
> > > > > > > > > > > > > > > > > > > > > > > > > options in the general way atm, since we do not have support for other
> > > > > > > > > > > > > > > > > > > > > > > > > resource types now. If we later decide to have per-resource-type config
> > > > > > > > > > > > > > > > > > > > > > > > > options, we can keep backwards compatibility with the currently proposed
> > > > > > > > > > > > > > > > > > > > > > > > > options through simple key mapping.
> > > > > > > > > > > > > > > > > > > > > > > > > - For the GPU Manager, if needed later we can change it to an "Extended
> > > > > > > > > > > > > > > > > > > > > > > > > Resource Manager" (or whatever it is called). That should be a pure
> > > > > > > > > > > > > > > > > > > > > > > > > component-internal refactoring.
> > > > > > > > > > > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are already fields for
> > > > > > > > > > > > > > > > > > > > > > > > > general extended resources. We can of course leverage them when
> > > > > > > > > > > > > > > > > > > > > > > > > supporting fine-grained GPU scheduling. That is also not in the scope of
> > > > > > > > > > > > > > > > > > > > > > > > > this first-step proposal, and would require FLIP-56 to be finished first.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a separate FLIP for
> > > > > > > > > > > > > > > > > > > > > > > > > the general extended resource mechanism, and keep it in mind when
> > > > > > > > > > > > > > > > > > > > > > > > > discussing and implementing the current one.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > > > https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to generalize the
> > > > > > > > > > > > > > > > > > > > > > > > > > resource management to support custom resources. Having that allows
> > > > > > > > > > > > > > > > > > > > > > > > > > users to add new resources by themselves. The general resource
> > > > > > > > > > > > > > > > > > > > > > > > > > management may involve two different aspects:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported by the extended
> > > > > > > > > > > > > > > > > > > > > > > > > > resources in ResourceProfile and ResourceSpec. This will likely cover
> > > > > > > > > > > > > > > > > > > > > > > > > > the majority of the cases.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to assign the
> > > > > > > > > > > > > > > > > > > > > > > > > > resources to different tasks, operators, and so on. This may require
> > > > > > > > > > > > > > > > > > > > > > > > > > two levels / steps:
> > > > > > > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into suitable slots.
> > > > > > > > > > > > > > > > > > > > > > > > > > It is done by the global RM and is not customizable right now.
> > > > > > > > > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the operators in the TM,
> > > > > > > > > > > > > > > > > > > > > > > > > > e.g. GPU 1 for operator A, GPU 2 for operator B. This step is needed
> > > > > > > > > > > > > > > > > > > > > > > > > > assuming the global RM does not distinguish individual resources of the
> > > > > > > > > > > > > > > > > > > > > > > > > > same type. It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should discover the
> > > > > > > > > > > > > > > > > > > > > > > > > > physical GPU information and bind/match it to each operator. Making
> > > > > > > > > > > > > > > > > > > > > > > > > > this general will fill in the missing piece to support custom resource
> > > > > > > > > > > > > > > > > > > > > > > > > > type definition. But I'd avoid calling it an "External Resource
> > > > > > > > > > > > > > > > > > > > > > > > > > Manager" to avoid confusion with the RM; maybe something like "Operator
> > > > > > > > > > > > > > > > > > > > > > > > > > Resource Assigner" would be more accurate. So for each resource type
> > > > > > > > > > > > > > > > > > > > > > > > > > users can have an optional "Operator Resource Assigner" in the TM. For
> > > > > > > > > > > > > > > > > > > > > > > > > > memory, users don't need this, but for other extended resources, users
> > > > > > > > > > > > > > > > > > > > > > > > > > may need that.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource Assigner" is
> > > > > > > > > > > > > > > > > > > > > > > > > > achievable in this FLIP. But I am also OK with having that in a
> > > > > > > > > > > > > > > > > > > > > > > > > > separate FLIP because the interface between the "Operator Resource
> > > > > > > > > > > > > > > > > > > > > > > > > > Assigner" and operator may take a while to settle down if we want to
> > > > > > > > > > > > > > > > > > > > > > > > > > make it generic. But I think our implementation should take this future
> > > > > > > > > > > > > > > > > > > > > > > > > > work into consideration so that we don't need to break backwards
> > > > > > > > > > > > > > > > > > > > > > > > > > compatibility once we have that.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of GPU-aware
> > > > > > > > > > > > > > > > > > > > > > > > > > > scheduling and GPU allocation, as I have no experience with that.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is whether it makes sense
> > > > > > > > > > > > > > > > > > > > > > > > > > > to look at the "GPU Manager" as an "External Resource Manager", and GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > is one such resource.
> > > > > > > > > > > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and ResourceSpec, that is how
> > > > > > > > > > > > > > > > > > > > > > > > > > > it is done there.
> > > > > > > > > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible. Maybe there is a
> > > > > > > > > > > > > > > > > > > > > > > > > > > GPU Resource, a specialized NVIDIA GPU Resource, an FPGA Resource, an
> > > > > > > > > > > > > > > > > > > > > > > > > > > Alibaba TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP, Yangze. GPU resource management support is a
> > > > > > > > > > > > > > > > > > > > > > > > > > > > must-have for machine learning use cases. Actually it is one of the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > most frequently asked questions from the users who are interested in
> > > > > > > > > > > > > > > > > > > > > > > > > > > > using Flink for ML.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Some quick comments / questions to the wiki.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be mentioned in the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > public interface section.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a public API?
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the discussion, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in Flink is
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > significant, especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I think it's
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > a very good first step for Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs are in a task executor and
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > forward such requirements to the external resource managers (for
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU resources to operators.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services to discover
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > and expose GPU resource information to the context of functions.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which we provide
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the privilege mode to help users achieve worker-level isolation in
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more
> > > details
> > > > in
> > > > > > the
> > > > > > > > > FLIP
> > > > > > > > > > wiki
> > > > > > > > > > > > > > > > document
> > > > > > > > > > > > > > > > > > [1].
> > > > > > > > > > > > > > > > > > > > > > Looking
> > > > > > > > > > > > > > > > > > > > > > > > > > forward
> > > > > > > > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > your feedbacks.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > >
> > > >
> > >
>

Yangze Guo

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Hi everyone,
I've updated the FLIP accordingly. The key change is replacing the two
resource allocation interfaces with config options.
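
As a rough illustration of that change (a sketch only, not the FLIP's
implementation): the snippet assumes the options expand to
`external-resource.<name>.amount` and
`external-resource.<name>.<yarn|kubernetes>.key`, which is one reading of
the shorthand used in this thread. The class and method names are
hypothetical, and `nvidia.com/gpu` / `yarn.io/gpu` are only example
resource keys.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper, not part of Flink. */
final class ExternalResourceConfigSketch {

    /**
     * Reads the deployment-specific key and the amount of one external
     * resource from plain config options, so that no user code has to
     * run on the RM. Returns empty if either option is missing.
     */
    static Optional<Map.Entry<String, Long>> requestEntry(
            Map<String, String> conf, String resourceName, String deployment) {
        // Assumed option layout; the final option names may differ.
        String prefix = "external-resource." + resourceName + ".";
        String key = conf.get(prefix + deployment + ".key");
        String amount = conf.get(prefix + "amount");
        if (key == null || amount == null) {
            return Optional.empty();
        }
        Map.Entry<String, Long> entry =
                new AbstractMap.SimpleImmutableEntry<>(key, Long.parseLong(amount));
        return Optional.of(entry);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("external-resource.gpu.amount", "2");
        conf.put("external-resource.gpu.kubernetes.key", "nvidia.com/gpu");
        conf.put("external-resource.gpu.yarn.key", "yarn.io/gpu");

        // The RM would copy this key/amount pair into the pod or container request.
        System.out.println(requestEntry(conf, "gpu", "kubernetes"));
        System.out.println(requestEntry(conf, "gpu", "yarn"));
    }
}
```

With this shape the RM only needs the configuration, which is the point
of dropping the two interfaces.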

If there are no further comments, I would like to start a voting
thread by tomorrow.

Best,
Yangze Guo

On Mon, Mar 30, 2020 at 9:15 PM Till Rohrmann <[hidden email]> wrote:

>
> If there is no need for the ExternalResourceDriver on the RM side, then it
> is always a good idea to keep it simple and not introduce it. One can
> always change things once one realizes that there is a need for it.
>
> Cheers,
> Till
>
> On Mon, Mar 30, 2020 at 12:00 PM Yangze Guo <[hidden email]> wrote:
>
> > Hi @Till, @Xintong
> >
> > I think even without the credential concerns, replacing the interfaces
> > with configuration options is a good idea from my side.
> > - Currently, I don't see any external resource that is incompatible
> > with this mechanism.
> > - It reduces the burden on users of implementing a plugin themselves.
> > WDYT?
> >
> > Best,
> > Yangze Guo
> >
> > On Mon, Mar 30, 2020 at 5:44 PM Xintong Song <[hidden email]>
> > wrote:
> > >
> > > I also agree that the pluggable ExternalResourceDriver should be loaded
> > > by the cluster class loader. Although the plugin might be implemented by
> > > users, external resources (as part of task executor resources) should be
> > > cluster configurations, unlike job-level user code such as UDFs, because
> > > the task executors belong to the cluster rather than to jobs.
> > >
> > >
> > > IIUC, the concern Stephan raised is about the potential credential
> > > problem when executing user code on the RM with the cluster class loader.
> > > The concern makes sense to me, and I think what Yangze suggested should
> > > be a good approach to preventing such credential problems. The only
> > > reason we tried to execute user code (i.e.
> > > getKubernetes/YarnExternalResource) on the RM was that we need to set
> > > these key-value pairs on pod/container requests. By replacing the
> > > interfaces getKubernetes/YarnExternalResource with the configuration
> > > options 'external-resource.{resourceName}.yarn/kubernetes.key/amount',
> > > we can still fulfill that purpose, without the credential risks.
> > >
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Mon, Mar 30, 2020 at 5:17 PM Till Rohrmann <[hidden email]> wrote:
> > > >
> > > > At the moment the RM does not have a user code class loader and I agree
> > > > with Stephan that it should stay like this. This, however, does not
> > > > mean that we cannot support pluggable components in the RM. As long as
> > > > the plugins are on the system's class path, it should be fine for the
> > > > RM to load them. For example, we could add external resources via
> > > > Flink's plugin mechanism or something similar.
> > > >
> > > > A very simple implementation of such an ExternalResourceDriver could be
> > > > a class which simply returns what is written in the flink-conf.yaml
> > > > under a given key.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Mon, Mar 30, 2020 at 5:39 AM Yangze Guo <[hidden email]> wrote:
> > > >
> > > > > Hi, Stephan,
> > > > >
> > > > > I see your concern and I totally agree with you.
> > > > >
> > > > > The interface on the RM side is now `Map<String key, String/Long value>
> > > > > getYarn/KubernetesExternalResource()`. The only valid information the RM
> > > > > gets from it is the configuration key of that external resource in
> > > > > Yarn/K8s. The "String/Long value" would be the same as the
> > > > > external-resource.{resourceName}.amount.
> > > > > So, I think it makes sense to replace these two interfaces with two
> > > > > configs, i.e. external-resource.{resourceName}.yarn/kubernetes.key.
> > > > > We may lose some extensibility, but AFAIK it could work with common
> > > > > external resources like GPU, FPGA. WDYT?
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Fri, Mar 27, 2020 at 7:59 PM Stephan Ewen <[hidden email]> wrote:
> > > > > >
> > > > > > Maybe one final comment: It is probably not an issue, but let's try
> > > > > > and keep user code (via user code classloader) out of the
> > > > > > ResourceManager, if possible.
> > > > > >
> > > > > > As background:
> > > > > >
> > > > > > There were thoughts in the past to support setups where the RM must
> > > > > > run with "superuser" credentials, but we cannot run JM/TM with these
> > > > > > credentials, as the user code might access them otherwise.
> > > > > > This is actually possible today, you can run the RM in a different
> > > > > > JVM or in a different container, and give it more credentials than
> > > > > > JMs / TMs. But for this to be feasible, we cannot allow any
> > > > > > user-defined code to be in the JVM, because that instantaneously
> > > > > > breaks the isolation of credentials.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:
> > > > > > >
> > > > > > > Thanks for the feedback, @Till and @Xintong.
> > > > > > >
> > > > > > > Regarding separating the interface, I'm also +1 with it.
> > > > > > >
> > > > > > > Regarding the resource allocation interface, true, it's dangerous
> > > > > > > to give much access to user codes. Changing the return type to
> > > > > > > Map<String key, String/Long value> makes sense to me. AFAIK, it is
> > > > > > > compatible with all the first-party supported resources for
> > > > > > > Yarn/Kubernetes. It could also free us from the potential
> > > > > > > dependency issue as well.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > Thanks for updating the FLIP, Yangze.
> > > > > > > >
> > > > > > > > I agree with Till that we probably want to separate the K8s/Yarn
> > > > > > > > decorator calls. Users can still configure one driver class, and
> > > > > > > > we can use `instanceof` to check whether the driver implemented
> > > > > > > > K8s/Yarn specific interfaces.
> > > > > > > >
> > > > > > > > Moreover, I'm not sure about exposing entire `ContainerRequest` /
> > > > > > > > `Pod` (`AbstractKubernetesStepDecorator` directly manipulates on
> > > > > > > > `Pod`) to user codes. It gives more access to user codes than
> > > > > > > > needed for defining external resource, which might cause problems.
> > > > > > > > Instead, I would suggest to have interface like
> > > > > > > > `Map<String key, String value> getYarn/KubernetesExternalResource()`
> > > > > > > > and assemble them into `ContainerRequest` / `Pod` in
> > > > > > > > Yarn/KubernetesResourceManager.
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > >
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > I'm a bit late to the party. I think the current proposal looks
> > > > > > > > > good.
> > > > > > > > >
> > > > > > > > > Concerning the ExternalResourceDriver interface defined in the
> > > > > > > > > FLIP [1], I would suggest to not include the decorator calls for
> > > > > > > > > Kubernetes and Yarn in the base interface. Instead I would
> > > > > > > > > suggest to segregate the deployment specific decorator calls
> > > > > > > > > into separate interfaces. That way an ExternalResourceDriver
> > > > > > > > > does not have to support all deployments from the very
> > > > > > > > > beginning. Moreover, some resources might not be supported by a
> > > > > > > > > specific deployment target and the natural way to express this
> > > > > > > > > would be to not implement the respective deployment specific
> > > > > > > > > interface.
> > > > > > > > >
> > > > > > > > > Moreover, having void
> > > > > > > > > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > > > > > > > > in the ExternalResourceDriver interface would require Hadoop on
> > > > > > > > > Flink's classpath whenever the external resource driver is being
> > > > > > > > > used.
> > > > > > > > >
> > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Till
> > > > > > > > >
> > > > > > > > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > > Nice, thanks a lot!
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > > > > > > > > >
> > > > > > > > > > > I've updated the FLIP accordingly. I do not add a
> > > > > > > > > > > ResourceInfoProvider. Instead, I introduce the
> > > > > > > > > > > ExternalResourceDriver, which takes the responsibility of all
> > > > > > > > > > > relevant operations on both RM and TM sides.
> > > > > > > > > > > After a rethink about decoupling the management of external
> > > > > > > > > > > resources from TaskExecutor, I think we could do the same
> > > > > > > > > > > thing on the ResourceManager side. We do not need to add a
> > > > > > > > > > > specific allocation logic to the ResourceManager each time we
> > > > > > > > > > > add a specific external resource.
> > > > > > > > > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > > > > > > > > containerRequest.
> > > > > > > > > > > - For Kubernetes, ExternalResourceDriver could provide a
> > > > > > > > > > > decorator for the TM pod.
> > > > > > > > > > >
> > > > > > > > > > > In this way, just like MetricReporter, we allow users to
> > > > > > > > > > > define their custom ExternalResourceDriver. It is more
> > > > > > > > > > > extensible and fits the separation of concerns. For more
> > > > > > > > > > > details, please take a look at [1].
> > > > > > > > > > >
> > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > This sounds good to go ahead from my side.
> > > > > > > > > > > >
> > > > > > > > > > > > I like the approach that Becket suggested - in that case the
> > > > > > > > > > > > core abstraction that everyone would need to understand would
> > > > > > > > > > > > be "external resource allocation" and the
> > > > > > > > > > > > "ResourceInfoProvider", and the GPU specific code would be a
> > > > > > > > > > > > specific implementation only known to that component that
> > > > > > > > > > > > allocates the external resource. That fits the separation of
> > > > > > > > > > > > concerns well.
> > > > > > > > > > > >
> > > > > > > > > > > > I also understand that it should not be over-engineered in
> > > > > > > > > > > > the first version, so some simplification makes sense, and
> > > > > > > > > > > > then gradually expand from there.
> > > > > > > > > > > >
> > > > > > > > > > > > So +1 to go ahead with what was suggested above (Xintong /
> > > > > > > > > > > > Becket) from my side.
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > >
> > > > > > > > > > > > > I see your concern, and I completely agree with you that we
> > > > > > > > > > > > > should first think about the "library" / "plugin" /
> > > > > > > > > > > > > "extension" style if possible.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > If GPUs are sliced and assigned during scheduling, there
> > > > > > > > > > > > > > may be reason, although it looks that it would belong to
> > > > > > > > > > > > > > the slot then. Is that what we are doing here?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > In the current proposal, we do not have the GPUs sliced and
> > > > > > > > > > > > > assigned to slots, because it could be problematic without
> > > > > > > > > > > > > dynamic slot allocation. E.g., the number of GPUs might not
> > > > > > > > > > > > > be evenly divisible by the number of slots.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think it makes sense to eventually have the GPUs assigned
> > > > > > > > > > > > > to slots. Even then, we might still need a TM level
> > > > > > > > > > > > > GPUManager (or ResourceProvider like Becket suggested). For
> > > > > > > > > > > > > memory, in each slot we can simply request the amount of
> > > > > > > > > > > > > memory, leaving it to JVM / OS to decide which memory
> > > > > > > > > > > > > (address) should be assigned. For GPU, and potentially other
> > > > > > > > > > > > > resources like FPGA, we need to explicitly specify which GPU
> > > > > > > > > > > > > (index) should be used. Therefore, we need some component at
> > > > > > > > > > > > > the TM level to coordinate which slot uses which GPU.
> > > > > > > > > > > > >
> > > > > > > > > > > > > IMO, unless we say Flink will not support slot-level GPU
> > > > > > > > > > > > > slicing at least in the foreseeable future, I don't see a
> > > > > > > > > > > > > good way to avoid touching the TM core. To that end, I think
> > > > > > > > > > > > > Becket's suggestion points to a good direction, that
> > > > > > > > > > > > > supports more features (GPU, FPGA, etc.) with less coupling
> > > > > > > > > > > > > to the TM core (only needs to understand the general
> > > > > > > > > > > > > interfaces). The detailed implementation for specific
> > > > > > > > > > > > > resource types can even be encapsulated as a library.
> > > > > > > > > > > > >
> > > > > > > > > > > > > @Becket
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for sharing your thought on the final state. Despite
> > > > > > > > > > > > > the details how the interfaces should look like, I think
> > > > > > > > > > > > > this is a really good abstraction for supporting general
> > > > > > > > > > > > > resource types.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'd like to further clarify that, the following three things
> > > > > > > > > > > > > are all that the "Flink core" needs to understand.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - The *amount* of resource, for scheduling. Actually, we
> > > > > > > > > > > > >   already have the Resource class in ResourceProfile and
> > > > > > > > > > > > >   ResourceSpec for extended resource. It's just not really
> > > > > > > > > > > > >   used.
> > > > > > > > > > > > > - The *info*, that Flink provides to the operators / user
> > > > > > > > > > > > >   codes.
> > > > > > > > > > > > > - The *provider*, which generates the info based on the
> > > > > > > > > > > > >   amount.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The "core" does not need to understand the specific
> > > > > > > > > > > > > implementation details of the above three. They can even be
> > > > > > > > > > > > > implemented in a 3rd-party library. Similar to how we allow
> > > > > > > > > > > > > users to define their custom MetricReporter.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > >
> > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for the comment, Stephan.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - If everything becomes a "core feature", it will make
> > > > > > > > > > > > > > > the project hard to develop in the future. Thinking
> > > > > > > > > > > > > > > "library" / "plugin" / "extension" style where possible
> > > > > > > > > > > > > > > helps.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Completely agree. It is much more important to design a
> > > > > > > > > > > > > > mechanism than focusing on a specific case. Here is what I
> > > > > > > > > > > > > > am thinking to fully support custom resource management:
> > > > > > > > > > > > > > 1. On the JM / RM side, use ResourceProfile and
> > > > > > > > > > > > > > ResourceSpec to define the resource and the amount
> > > > > > > > > > > > > > required. They will be used to find suitable TM slots to
> > > > > > > > > > > > > > run the tasks. At this point, the resources are only
> > > > > > > > > > > > > > measured by amount, i.e. they do not have individual IDs.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. On the TM side, have something like
> > > > > > > > > > > > > > *"ResourceInfoProvider"* to identify and provide the
> > > > > > > > > > > > > > detailed information of the individual resource, e.g. GPU
> > > > > > > > > > > > > > ID. It is important because the operator may have to
> > > > > > > > > > > > > > explicitly interact with the physical resource it uses.
> > > > > > > > > > > > > > The ResourceInfoProvider might look like something below.
> > > > > > > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > > > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(
> > > > > > > > > > > > > >         OperatorId opId, ResourceProfile resourceProfile);
> > > > > > > > > > > > > > }
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - There could be several "*ResourceInfoProvider*"
> > > > > > > > > > > > > > configured on the TM to retrieve the information for
> > > > > > > > > > > > > > different resources.
> > > > > > > > > > > > > > - The TM will be responsible for assigning those
> > > > > > > > > > > > > > individual resources to each operator according to their
> > > > > > > > > > > > > > requested amount.
> > > > > > > > > > > > > > - The operators will be able to get the ResourceInfo from
> > > > > > > > > > > > > > their RuntimeContext.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If we agree this is a reasonable final state, we can adapt
> > > > > > > > > > > > > > the current FLIP to it. In fact it does not sound like a
> > > > > > > > > > > > > > big change to me. All the proposed configuration can stay
> > > > > > > > > > > > > > as is, it is just that Flink itself won't care about them,
> > > > > > > > > > > > > > instead a GPUInfoProvider implementing the
> > > > > > > > > > > > > > ResourceInfoProvider will use them.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi all!
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The main point I wanted to throw into the discussion is
> > > > > > > > > > > > > > > the following:
> > > > > > > > > > > > > > > - With more and more use cases, more and more tools go
> > > > > > > > > > > > > > > into Flink
> > > > > > > > > > > > > > > - If everything becomes a "core feature", it will make
> > > > > > > > > > > > > > > the project hard to develop in the future. Thinking
> > > > > > > > > > > > > > > "library" / "plugin" / "extension" style where possible
> > > > > > > > > > > > > > > helps.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - A good thought experiment is always: How many future
> > > > > > > > > > > > > > > developers have to interact with this code (and possibly
> > > > > > > > > > > > > > > understand it partially), even if the features they
> > > > > > > > > > > > > > > touch have nothing to do with GPU support. If many
> > > > > > > > > > > > > > > contributors to unrelated features will have to touch it
> > > > > > > > > > > > > > > and understand it, then let's think if there is a
> > > > > > > > > > > > > > > different solution. Maybe there is not, but then we
> > > > > > > > > > > > > > > should be sure why.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - That led me to raising this issue: If the GPU manager
> > > > > > > > > > > > > > > becomes a core service in the TaskManager, Environment,
> > > > > > > > > > > > > > > RuntimeContext, etc. then everyone developing TM and
> > > > > > > > > > > > > > > streaming tasks need to understand the GPU manager. That
> > > > > > > > > > > > > > > seems oddly specific, is my impression.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Access to configuration seems not the right reason to do
> > > > > > > > > > > > > > > that. We should expose the Flink configuration from the
> > > > > > > > > > > > > > > RuntimeContext anyways.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If GPUs are sliced and assigned during scheduling, there
> > > > > > > > > > > > > > > may be reason, although it looks that it would belong to
> > > > > > > > > > > > > > > the slot then. Is that what we are doing here?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > IMO, eventually an operator should only see info of
> > > > > > > > > > > > > > > > GPUs that are dedicated for it, instead of all GPUs on
> > > > > > > > > > > > > > > > the machine/container in the current design. It does
> > > > > > > > > > > > > > > > not make sense to let the user who writes a UDF to
> > > > > > > > > > > > > > > > worry about coordination among multiple operators
> > > > > > > > > > > > > > > > running on the same machine. And if we want to limit
> > > > > > > > > > > > > > > > the GPU info an operator sees, we should not let the
> > > > > > > > > > > > > > > > operator to instantiate GPUManager, which means we
> > > > > > > > > > > > > > > > have to expose something through runtime context,
> > > > > > > > > > > > > > > > either GPU info or some kind of limited access to the
> > > > > > > > > > > > > > > > GPUManager.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It probably makes sense for us to first agree on the
> > > > > > > > > > > > > > > > > final state. More specifically, will the resource
> > > > > > > > > > > > > > > > > info be exposed through runtime context eventually?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If that is the final state and we have a seamless
> > > > > > > > > > > > > > > > > migration story from this FLIP to that final state,
> > > > > > > > > > > > > > > > > personally I think it is OK to expose the GPU info
> > > > > > > > > > > > > > > > > in the runtime context.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > > > > > > > > I think what Stephan means (@Stephan, please correct me if I'm
> > > > > > > > > > > > > > > > > > > wrong) is that we might not need to hold and maintain the
> > > > > > > > > > > > > > > > > > > GPUManager as a service in TaskManagerServices or RuntimeContext.
> > > > > > > > > > > > > > > > > > > An alternative is to create / retrieve the GPUManager only in the
> > > > > > > > > > > > > > > > > > > operators that need it, e.g., with a static method
> > > > > > > > > > > > > > > > > > > `GPUManager.get()`.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > > > > > > > > I agree with you on excluding GPUManager from TaskManagerServices.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >    - For the first step, where we provide unified TM-level GPU
> > > > > > > > > > > > > > > > > > >      information to all operators, it should be fine to have
> > > > > > > > > > > > > > > > > > >      operators access / lazily initialize the GPUManager by
> > > > > > > > > > > > > > > > > > >      themselves.
> > > > > > > > > > > > > > > > > > >    - In the future, we might have some more fine-grained GPU
> > > > > > > > > > > > > > > > > > >      management, where we need to maintain GPUManager as a service
> > > > > > > > > > > > > > > > > > >      and put GPU info in slot profiles. But at least for now it's
> > > > > > > > > > > > > > > > > > >      not necessary to introduce such complexity.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > However, I have some concerns about excluding GPUManager from
> > > > > > > > > > > > > > > > > > > RuntimeContext and letting operators access it directly.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >    - Configurations needed for creating the GPUManager are not
> > > > > > > > > > > > > > > > > > >      always available to operators.
> > > > > > > > > > > > > > > > > > >    - If later we want to have fine-grained control over GPU (e.g.,
> > > > > > > > > > > > > > > > > > >      operators in each slot can only see GPUs reserved for that
> > > > > > > > > > > > > > > > > > >      slot), the approach cannot be easily extended.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I would suggest wrapping the GPUManager behind RuntimeContext and
> > > > > > > > > > > > > > > > > > > only exposing the GPUInfo to users. For now, we can declare a
> > > > > > > > > > > > > > > > > > > method `getGPUInfo()` in RuntimeContext, with a default definition
> > > > > > > > > > > > > > > > > > > that calls `GPUManager.get()` to get the lazily-created
> > > > > > > > > > > > > > > > > > > GPUManager. If later we want to create / retrieve GPUManager in a
> > > > > > > > > > > > > > > > > > > different way, we can simply change how `getGPUInfo` is
> > > > > > > > > > > > > > > > > > > implemented, without needing to change any public interfaces.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
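The proposal in the message above (a `getGPUInfo()` default method that delegates to a lazily created `GPUManager`) can be sketched roughly as follows. This is only an illustration under stated assumptions: `GPUInfo`, `GPUManager`, `SketchRuntimeContext` and the hard-coded GPU indexes are hypothetical stand-ins, not Flink's real `RuntimeContext` and not the API the FLIP finally adopted.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical read-only view of the GPUs an operator may use. */
final class GPUInfo {
    private final Set<String> indexes;

    GPUInfo(Set<String> indexes) {
        // Defensive copy so callers cannot mutate the manager's state.
        this.indexes = Collections.unmodifiableSet(new LinkedHashSet<>(indexes));
    }

    Set<String> getIndexes() {
        return indexes;
    }
}

/** Hypothetical TM-level manager, created lazily on first access. */
final class GPUManager {
    private static volatile GPUManager instance;
    private final GPUInfo info;

    private GPUManager(GPUInfo info) {
        this.info = info;
    }

    /** Returns the single shared instance, creating it on first call. */
    static GPUManager get() {
        GPUManager local = instance;
        if (local == null) {
            synchronized (GPUManager.class) {
                local = instance;
                if (local == null) {
                    // Assumption: a real implementation would run the
                    // configured discovery script here instead.
                    Set<String> discovered = new LinkedHashSet<>();
                    discovered.add("0");
                    discovered.add("1");
                    local = new GPUManager(new GPUInfo(discovered));
                    instance = local;
                }
            }
        }
        return local;
    }

    GPUInfo getGPUInfo() {
        return info;
    }
}

/** Stand-in for a runtime context; only GPUInfo is exposed to users. */
interface SketchRuntimeContext {
    default GPUInfo getGPUInfo() {
        // The manager stays an implementation detail behind this method.
        return GPUManager.get().getGPUInfo();
    }
}
```

With this shape, user code only ever sees `GPUInfo`. How the manager is created or retrieved can change later by overriding `getGPUInfo()`, which is the point made in the message about not touching public interfaces.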
> > > > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > > > > > > > Do you mean MiniCluster? Yes, it makes sense to share the GPU
> > > > > > > > > > > > > > > > > > > > Manager in such a scenario.
> > > > > > > > > > > > > > > > > > > > If that's what you worry about, I'm +1 for holding
> > > > > > > > > > > > > > > > > > > > GPUManager (ExternalResourceManagers) in TaskExecutor instead of
> > > > > > > > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Regarding the RuntimeContext/FunctionContext, it just holds the
> > > > > > > > > > > > > > > > > > > > GPU info instead of the GPU Manager. AFAIK, it's the only place
> > > > > > > > > > > > > > > > > > > > we could pass GPU info to the RichFunction/UserDefinedFunction.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac Godfried <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20 +0000 [hidden email] wrote ----
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Can we somehow keep this out of the TaskManager services
> > > > > > > > > > > > > > > > > > > > > > I fear that we could not. IMO, the GPUManager (or
> > > > > > > > > > > > > > > > > > > > > > ExternalServicesManagers in future) is conceptually one of the
> > > > > > > > > > > > > > > > > > > > > > task manager services, just like MemoryManager before 1.10.
> > > > > > > > > > > > > > > > > > > > > > - It maintains/holds the GPU resource at TM level and all of
> > > > > > > > > > > > > > > > > > > > > >   the operators allocate the GPU resources from it. So, it
> > > > > > > > > > > > > > > > > > > > > >   should be exclusive to a single TaskExecutor.
> > > > > > > > > > > > > > > > > > > > > > - We could add a collection called ExternalResourceManagers to
> > > > > > > > > > > > > > > > > > > > > >   hold all managers of other external resources in the future.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Can you help me understand why this needs the addition in
> > > > > > > > > > > > > > > > > > > > > TaskManagerServices or in the RuntimeContext?
> > > > > > > > > > > > > > > > > > > > > Are you worried about the case when multiple Task Executors run
> > > > > > > > > > > > > > > > > > > > > in the same JVM? That's not common, but wouldn't it actually be
> > > > > > > > > > > > > > > > > > > > > good in that case to share the GPU Manager, given that the GPU
> > > > > > > > > > > > > > > > > > > > > is shared?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > > > > > > > > > > In this FLIP, operators need the information. Thus, we expose
> > > > > > > > > > > > > > > > > > > > > > GPU information to the RuntimeContext/FunctionContext. The slot
> > > > > > > > > > > > > > > > > > > > > > profile is not aware of GPU resources, as GPU is a TM-level
> > > > > > > > > > > > > > > > > > > > > > resource now.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Can the GPU Manager be a "self contained" thing that simply
> > > > > > > > > > > > > > > > > > > > > > > takes the configuration, and then abstracts everything
> > > > > > > > > > > > > > > > > > > > > > > internally?
> > > > > > > > > > > > > > > > > > > > > > Yes, we just pass the path/args of the discovery script and how
> > > > > > > > > > > > > > > > > > > > > > many GPUs per TM to it. It takes the responsibility to get the
> > > > > > > > > > > > > > > > > > > > > > GPU information and expose it to the
> > > > > > > > > > > > > > > > > > > > > > RuntimeContext/FunctionContext of Operators. Meanwhile, we'd
> > > > > > > > > > > > > > > > > > > > > > better not allow operators to directly access GPUManager; they
> > > > > > > > > > > > > > > > > > > > > > should get what they need from the Context. We could then
> > > > > > > > > > > > > > > > > > > > > > decouple the interface/implementation of GPUManager from the
> > > > > > > > > > > > > > > > > > > > > > Public API.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > It sounds fine to initially start with GPU specific support and
> > > > > > > > > > > > > > > > > > > > > > > think about generalizing this once we better understand the
> > > > > > > > > > > > > > > > > > > > > > > space.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager services?
> > > > > > > > > > > > > > > > > > > > > > >   Anything we have to pull through all layers of the TM makes
> > > > > > > > > > > > > > > > > > > > > > >   the TM components yet more complex and harder to maintain.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > > > > > > > > > > >   -> do the slot profiles need information about the GPU?
> > > > > > > > > > > > > > > > > > > > > > >   -> Can the GPU Manager be a "self contained" thing that
> > > > > > > > > > > > > > > > > > > > > > >      simply takes the configuration, and then abstracts
> > > > > > > > > > > > > > > > > > > > > > >      everything internally? Operators can access it via
> > > > > > > > > > > > > > > > > > > > > > >      "GPUManager.get()" or so?
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right, I'll add them to
> > > > > > > > > > > > > > > > > > > > > > > > the Public API section.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > > > > > > > Regarding the general extended resource mechanism, I second
> > > > > > > > > > > > > > > > > > > > > > > > Xintong's suggestion.
> > > > > > > > > > > > > > > > > > > > > > > > - It's better to leverage ResourceProfile and ResourceSpec once
> > > > > > > > > > > > > > > > > > > > > > > >   we support fine-grained GPU scheduling. As a first step
> > > > > > > > > > > > > > > > > > > > > > > >   proposal, I prefer to not include it in the scope of this
> > > > > > > > > > > > > > > > > > > > > > > >   FLIP.
> > > > > > > > > > > > > > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I understand
> > > > > > > > > > > > > > > > > > > > > > > >   correctly, it is just a code refactoring atm; we could
> > > > > > > > > > > > > > > > > > > > > > > >   extract the open/close/allocateExtendResources of GPUManager
> > > > > > > > > > > > > > > > > > > > > > > >   to that interface. If that is the case, +1 to do it during
> > > > > > > > > > > > > > > > > > > > > > > >   implementation.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > > > > > > > As Xintong said, we looked into how Spark supports a general
> > > > > > > > > > > > > > > > > > > > > > > > "Custom Resource Scheduling" before and decided to introduce a
> > > > > > > > > > > > > > > > > > > > > > > > common resource configuration schema
> > > > > > > > > > > > > > > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script)
> > > > > > > > > > > > > > > > > > > > > > > > to make it more extensible. I think the "resource" is a proper
> > > > > > > > > > > > > > > > > > > > > > > > level to contain all the configs of extended resources.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > >
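The configuration schema mentioned in the message above, taskmanager.resource.{resourceName}.amount/discovery-script, can be illustrated with a small key-building helper. This is a sketch only: the class `ExtendedResourceKeys` and its methods are hypothetical names introduced for illustration, and the key layout simply follows the schema as quoted in this thread rather than any released Flink option.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for the per-resource key schema quoted in the thread. */
final class ExtendedResourceKeys {
    // Assumption: common prefix taken from the schema in the message above.
    private static final String PREFIX = "taskmanager.resource.";

    private ExtendedResourceKeys() {}

    /** Builds "taskmanager.resource.<resourceName>.<suffix>". */
    static String key(String resourceName, String suffix) {
        return PREFIX + resourceName + "." + suffix;
    }

    static String amountKey(String resourceName) {
        return key(resourceName, "amount");
    }

    static String discoveryScriptKey(String resourceName) {
        return key(resourceName, "discovery-script");
    }

    /**
     * Collects all options of one resource type from a flat config map,
     * with the common prefix stripped from the returned keys.
     */
    static Map<String, String> optionsFor(Map<String, String> conf, String resourceName) {
        String resourcePrefix = PREFIX + resourceName + ".";
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : conf.entrySet()) {
            if (entry.getKey().startsWith(resourcePrefix)) {
                result.put(entry.getKey().substring(resourcePrefix.length()), entry.getValue());
            }
        }
        return result;
    }
}
```

Because the resource name is just one segment of the key, adding another resource type such as FPGA would need no new option classes under this scheme, which is the extensibility argument made in the message.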
> > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource management support will
> > > > > > > > > > > > > > > > > > > > > > > > > greatly facilitate the development of AI-related applications
> > > > > > > > > > > > > > > > > > > > > > > > > by PyFlink users.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I think it
> > > > > > > > > > > > > > > > > > > > > > > > > is better to delete the resource field to make them
> > > > > > > > > > > > > > > > > > > > > > > > > consistent with the names of other resource-related
> > > > > > > > > > > > > > > > > > > > > > > > > configurations in TaskManagerOption.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline discussion
> > > > > > > > > > > > > > > > > > > > > > > > > > about making the "GPU Support" a more general "Extended
> > > > > > > > > > > > > > > > > > > > > > > > > > Resource Support". We believe supporting extended resources
> > > > > > > > > > > > > > > > > > > > > > > > > > through a general mechanism is definitely a good and
> > > > > > > > > > > > > > > > > > > > > > > > > > extensible way. The reason we propose this FLIP with its
> > > > > > > > > > > > > > > > > > > > > > > > > > scope narrowed down to GPU alone is mainly the concern about
> > > > > > > > > > > > > > > > > > > > > > > > > > the extra effort and review capacity needed for a general
> > > > > > > > > > > > > > > > > > > > > > > > > > mechanism.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > To come up with a good design for a general extended
> > > > > > > > > > > > > > > > > > > > > > > > > > resource management mechanism, we would need to investigate
> > > > > > > > > > > > > > > > > > > > > > > > > > more on how people use different kinds of resources in
> > > > > > > > > > > > > > > > > > > > > > > > > > practice. For GPU, we learnt such knowledge from the
> > > > > > > > > > > > > > > > > > > > > > > > > > experts, Becket and his team members. But for FPGA, or other
> > > > > > > > > > > > > > > > > > > > > > > > > > potential extended resources, we don't have such convenient
> > > > > > > > > > > > > > > > > > > > > > > > > > information sources, making the investigation require more
> > > > > > > > > > > > > > > > > > > > > > > > > > effort, which I tend to think is not necessary atm.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark supports a
> > > > > > > > > > > > > > > > > > > > > > > > > > general "Custom Resource Scheduling". Assuming we want to
> > > > > > > > > > > > > > > > > > > > > > > > > > have a similar general extended resource mechanism in the
> > > > > > > > > > > > > > > > > > > > > > > > > > future, we believe that the current GPU support design can
> > > > > > > > > > > > > > > > > > > > > > > > > > be easily extended, in an incremental way without too much
> > > > > > > > > > > > > > > > > > > > > > > > > > rework.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >    - The most important part is probably user interfaces.
> > > > > > > > > > > > > > > > > > > > > > > > > >      Spark offers configuration options to define the amount,
> > > > > > > > > > > > > > > > > > > > > > > > > >      discovery script and vendor (on k8s) on a per resource
> > > > > > > > > > > > > > > > > > > > > > > > > >      type basis [1], which is very similar to what we
> > > > > > > > > > > > > > > > > > > > > > > > > >      proposed in this FLIP. I think it's not necessary to
> > > > > > > > > > > > > > > > > > > > > > > > > >      expose config options in the general way atm, since we
> > > > > > > > > > > > > > > > > > > > > > > > > >      do not have support for other resource types now. If
> > > > > > > > > > > > > > > > > > > > > > > > > >      later we decide to have per resource type config
> > > > > > > > > > > > > > > > > > > > > > > > > >      options, we can keep backwards compatibility with the
> > > > > > > > > > > > > > > > > > > > > > > > > >      currently proposed options through simple key mapping.
> > > > > > > > > > > > > > > > > > > > > > > > > >    - For the GPU Manager, if later needed we can change it to
> > > > > > > > > > > > > > > > > > > > > > > > > >      an "Extended Resource Manager" (or whatever it is
> > > > > > > > > > > > > > > > > > > > > > > > > >      called). That should be a pure component-internal
> > > > > > > > > > > > > > > > > > > > > > > > > >      refactoring.
> > > > > > > > > > > > > > > > > > > > > > > > > >    - For ResourceProfile and ResourceSpec, there are already
> > > > > > > > > > > > > > > > > > > > > > > > > >      fields for general extended resource. We can of course
> > > > > > > > > > > > > > > > > > > > > > > > > >      leverage them when supporting fine grained GPU
> > > > > > > > > > > > > > > > > > > > > > > > > >      scheduling. That is also not in the scope of this first
> > > > > > > > > > > > > > > > > > > > > > > > > >      step proposal, and would require FLIP-56 to be finished
> > > > > > > > > > > > > > > > > > > > > > > > > >      first.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a
> > > > > > > > > > > > > > > > > > > > > > > > > > separate FLIP for the general extended resource mechanism,
> > > > > > > > > > > > > > > > > > > > > > > > > > and keep it in mind when discussing and implementing the
> > > > > > > > > > > > > > > > > > > > > > > > > > current one.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > [1] https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to generalize the
> > > > > > > > > > > > > > > > > > > > > > > > > > > resource management to support custom resources. Having that allows
> > > > > > > > > > > > > > > > > > > > > > > > > > > users to add new resources by themselves. The general resource
> > > > > > > > > > > > > > > > > > > > > > > > > > > management may involve two different aspects:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported by the
> > > > > > > > > > > > > > > > > > > > > > > > > > > extended resources in ResourceProfile and ResourceSpec. This will
> > > > > > > > > > > > > > > > > > > > > > > > > > > likely cover the majority of the cases.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to assign the
> > > > > > > > > > > > > > > > > > > > > > > > > > > resources to different tasks, operators, and so on. This may require
> > > > > > > > > > > > > > > > > > > > > > > > > > > two levels / steps:
> > > > > > > > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into suitable
> > > > > > > > > > > > > > > > > > > > > > > > > > > slots. It is done by the global RM and is not customizable right now.
> > > > > > > > > > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the operators in the TM,
> > > > > > > > > > > > > > > > > > > > > > > > > > > e.g. GPU 1 for operator A, GPU 2 for operator B. This step is needed
> > > > > > > > > > > > > > > > > > > > > > > > > > > assuming the global RM does not distinguish individual resources of
> > > > > > > > > > > > > > > > > > > > > > > > > > > the same type. It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should discover the
> > > > > > > > > > > > > > > > > > > > > > > > > > > physical GPU information and bind/match it to each operator. Making
> > > > > > > > > > > > > > > > > > > > > > > > > > > this general will fill in the missing piece to support custom
> > > > > > > > > > > > > > > > > > > > > > > > > > > resource type definition. But I'd avoid calling it an "External
> > > > > > > > > > > > > > > > > > > > > > > > > > > Resource Manager" to avoid confusion with the RM; maybe something like
> > > > > > > > > > > > > > > > > > > > > > > > > > > "Operator Resource Assigner" would be more accurate. So for each
> > > > > > > > > > > > > > > > > > > > > > > > > > > resource type users can have an optional "Operator Resource
> > > > > > > > > > > > > > > > > > > > > > > > > > > Assigner" in the TM. For memory, users don't need this, but for other
> > > > > > > > > > > > > > > > > > > > > > > > > > > extended resources, users may need that.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource Assigner" is
> > > > > > > > > > > > > > > > > > > > > > > > > > > achievable in this FLIP. But I am also OK with having that in a
> > > > > > > > > > > > > > > > > > > > > > > > > > > separate FLIP because the interface between the "Operator Resource
> > > > > > > > > > > > > > > > > > > > > > > > > > > Assigner" and the operator may take a while to settle down if we want
> > > > > > > > > > > > > > > > > > > > > > > > > > > to make it generic. But I think our implementation should take this
> > > > > > > > > > > > > > > > > > > > > > > > > > > future work into consideration so that we don't need to break
> > > > > > > > > > > > > > > > > > > > > > > > > > > backwards compatibility once we have that.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > > >
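The operator-level step (2.b) described in the message above can be illustrated with a minimal Java sketch. The class name `OperatorResourceAssigner` is taken from the naming suggestion in that message; its methods, the use of integer GPU indices, and the first-come-first-served policy are assumptions made for illustration and are not part of the FLIP or of Flink's API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical TM-level component for step 2.b; not a Flink class. */
final class OperatorResourceAssigner {
    // GPU indices discovered on this task manager that are not yet bound.
    private final Deque<Integer> freeGpuIndices;
    // Operator id -> GPU index currently bound to that operator.
    private final Map<String, Integer> assignments = new HashMap<>();

    OperatorResourceAssigner(List<Integer> discoveredGpuIndices) {
        this.freeGpuIndices = new ArrayDeque<>(discoveredGpuIndices);
    }

    /** Returns the GPU bound to the operator, binding a free one on first request. */
    synchronized Optional<Integer> assign(String operatorId) {
        Integer existing = assignments.get(operatorId);
        if (existing != null) {
            return Optional.of(existing);
        }
        Integer next = freeGpuIndices.poll();
        if (next == null) {
            return Optional.empty(); // all GPUs are already bound
        }
        assignments.put(operatorId, next);
        return Optional.of(next);
    }

    /** Returns the operator's GPU, if any, to the free pool. */
    synchronized void release(String operatorId) {
        Integer index = assignments.remove(operatorId);
        if (index != null) {
            freeGpuIndices.addFirst(index);
        }
    }
}
```

With GPUs 1 and 2 discovered, `assign("A")` binds GPU 1 and `assign("B")` binds GPU 2, matching the "GPU 1 for operator A, GPU 2 for operator B" example. Memory needs no such component because the individual units are interchangeable.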
> > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of GPU-aware
> > > > > > > > > > > > > > > > > > > > > > > > > > > > scheduling and GPU allocation, as I have no experience with that.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is if it makes sense to
> > > > > > > > > > > > > > > > > > > > > > > > > > > > look at the "GPU Manager" as an "External Resource Manager", and GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > > is one such resource.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and ResourceSpec, that is
> > > > > > > > > > > > > > > > > > > > > > > > > > > > how it is done there.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible. Maybe there is a
> > > > > > > > > > > > > > > > > > > > > > > > > > > > GPU Resource, a specialized NVIDIA GPU Resource, an FPGA Resource, an
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Alibaba TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP, Yangze. GPU resource management support is a
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > must-have for machine learning use cases. Actually it is one of the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > most frequently asked questions from users who are interested in
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > using Flink for ML.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Some quick comments / questions on the wiki.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be mentioned in the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > public interface section.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a public API?
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the discussion, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in Flink is
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > significant, especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I think it's
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > a very good first step for Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > support in Flink" [1].
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs a task executor has and
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > forward such requirements to the external resource managers (for
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU resources to operators.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services to
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > discover and expose GPU resource information to the context of
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > functions.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which we
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > provide the privilege mode to help users achieve worker-level
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > isolation in standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki document [1]. Looking
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > forward to your feedback.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo

Stephan Ewen

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Sounds good!

On Tue, Mar 31, 2020 at 4:32 AM Yangze Guo <[hidden email]> wrote:

> Hi everyone,
> I've updated the FLIP accordingly. The key change is replacing the two
> resource allocation interfaces with config options.
>
> If there are no further comments, I would like to start a voting
> thread by tomorrow.
>
> Best,
> Yangze Guo
>
> On Mon, Mar 30, 2020 at 9:15 PM Till Rohrmann <[hidden email]>
> wrote:
> >
> > If there is no need for the ExternalResourceDriver on the RM side, then it
> > is always a good idea to keep it simple and not introduce it. One can
> > always change things once one realizes that there is a need for it.
> >
> > Cheers,
> > Till
> >
> > On Mon, Mar 30, 2020 at 12:00 PM Yangze Guo <[hidden email]> wrote:
> >
> > > Hi @Till, @Xintong
> > >
> > > I think even without the credential concerns, replacing the interfaces
> > > with configuration options is a good idea.
> > > - Currently, I don't see any external resource that is incompatible
> > > with this mechanism.
> > > - It spares users the burden of implementing a plugin themselves.
> > > WDYT?
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Mon, Mar 30, 2020 at 5:44 PM Xintong Song <[hidden email]>
> > > wrote:
> > > >
> > > > I also agree that the pluggable ExternalResourceDriver should be loaded by
> > > > the cluster class loader. Although the plugin might be implemented by users,
> > > > external resources (as part of task executor resources) should be cluster
> > > > configurations, unlike job-level user code such as UDFs, because the task
> > > > executors belong to the cluster rather than to jobs.
> > > >
> > > >
> > > > IIUC, the concern Stephan raised is about the potential credential problem
> > > > when executing user code on the RM with the cluster class loader. The concern
> > > > makes sense to me, and I think what Yangze suggested is a good approach
> > > > to preventing such credential problems. The only reason we tried to
> > > > execute user code (i.e. getKubernetes/YarnExternalResource) on the RM was
> > > > that we need to set these key-value pairs on pod/container requests.
> > > > By replacing the interfaces getKubernetes/YarnExternalResource with the
> > > > configuration options
> > > > 'external-resource.{resourceName}.yarn/kubernetes.key/amount',
> > > > we can still fulfill that purpose, without the credential risks.
> > > >
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
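A minimal Java sketch of the idea in the message above: the RM derives the key-value pair for a pod/container request purely from configuration, so no user code has to run on the RM. The option names follow the pattern quoted in this thread (`external-resource.{resourceName}.amount`, `external-resource.{resourceName}.yarn.key`, `external-resource.{resourceName}.kubernetes.key`); the names finally adopted by Flink may differ. `ExternalResourceOptions` and `requestEntries` are hypothetical names, and a plain `Map` stands in for Flink's configuration object.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper; not a Flink class. */
final class ExternalResourceOptions {

    enum Deployment { YARN, KUBERNETES }

    /**
     * Builds the entry to set on a container/pod request for one external resource.
     * Returns an empty map when the resource is not configured for this deployment.
     */
    static Map<String, Long> requestEntries(
            Map<String, String> conf, String resourceName, Deployment deployment) {
        String prefix = "external-resource." + resourceName + ".";
        // e.g. external-resource.gpu.yarn.key (assumed option name, see lead-in)
        String key = conf.get(prefix + deployment.name().toLowerCase(Locale.ROOT) + ".key");
        String amount = conf.get(prefix + "amount");
        if (key == null || amount == null) {
            return Collections.emptyMap();
        }
        return Collections.singletonMap(key, Long.parseLong(amount));
    }
}
```

For example, with `external-resource.gpu.amount` set to `2` and `external-resource.gpu.yarn.key` set to `yarn.io/gpu`, the Yarn side would request two units of `yarn.io/gpu`, while the Kubernetes side would request nothing unless `external-resource.gpu.kubernetes.key` is also set (for instance to `nvidia.com/gpu`).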
> > > > On Mon, Mar 30, 2020 at 5:17 PM Till Rohrmann <[hidden email]> wrote:
> > > > >
> > > > > At the moment the RM does not have a user code class loader and I agree
> > > > > with Stephan that it should stay like this. This, however, does not mean
> > > > > that we cannot support pluggable components in the RM. As long as the
> > > > > plugins are on the system's class path, it should be fine for the RM to
> > > > > load them. For example, we could add external resources via Flink's plugin
> > > > > mechanism or something similar.
> > > > >
> > > > > A very simple implementation of such an ExternalResourceDriver could be a
> > > > > class which simply returns what is written in the flink-conf.yaml under a
> > > > > given key.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Mon, Mar 30, 2020 at 5:39 AM Yangze Guo <[hidden email]> wrote:
> > > > > >
> > > > > > Hi, Stephan,
> > > > > >
> > > > > > I see your concern and I totally agree with you.
> > > > > >
> > > > > > The interface on the RM side is now `Map<String key, String/Long value>
> > > > > > getYarn/KubernetesExternalResource()`. The only valid information the RM
> > > > > > gets from it is the configuration key of that external resource in
> > > > > > Yarn/K8s. The "String/Long value" would be the same as
> > > > > > external-resource.{resourceName}.amount.
> > > > > > So, I think it makes sense to replace these two interfaces with two
> > > > > > configs, i.e. external-resource.{resourceName}.yarn/kubernetes.key. We
> > > > > > may lose some extensibility, but AFAIK it could work with common
> > > > > > external resources like GPU, FPGA. WDYT?
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > > > On Fri, Mar 27, 2020 at 7:59 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > >
> > > > > > > Maybe one final comment: It is probably not an issue, but let's try and
> > > > > > > keep user code (via user code classloader) out of the ResourceManager, if
> > > > > > > possible.
> > > > > > >
> > > > > > > As background:
> > > > > > >
> > > > > > > There were thoughts in the past to support setups where the RM must run
> > > > > > > with "superuser" credentials, but we cannot run JM/TM with these
> > > > > > > credentials, as the user code might access them otherwise.
> > > > > > > This is actually possible today, you can run the RM in a different JVM or
> > > > > > > in a different container, and give it more credentials than JMs / TMs. But
> > > > > > > for this to be feasible, we cannot allow any user-defined code to be in the
> > > > > > > JVM, because that instantaneously breaks the isolation of credentials.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > Thanks for the feedback, @Till and @Xintong.
> > > > > > > >
> > > > > > > > Regarding separating the interface, I'm also +1 for it.
> > > > > > > >
> > > > > > > > Regarding the resource allocation interface, true, it's dangerous to
> > > > > > > > give too much access to user code. Changing the return type to Map<String
> > > > > > > > key, String/Long value> makes sense to me. AFAIK, it is compatible
> > > > > > > > with all the first-party supported resources for Yarn/Kubernetes. It
> > > > > > > > could also free us from the potential dependency issue.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > > On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > Thanks for updating the FLIP, Yangze.
> > > > > > > > >
> > > > > > > > > I agree with Till that we probably want to separate the K8s/Yarn decorator
> > > > > > > > > calls. Users can still configure one driver class, and we can use
> > > > > > > > > `instanceof` to check whether the driver implements the K8s/Yarn-specific
> > > > > > > > > interfaces.
> > > > > > > > >
> > > > > > > > > Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
> > > > > > > > > (`AbstractKubernetesStepDecorator` directly manipulates the `Pod`) to user
> > > > > > > > > code. It gives user code more access than is needed for defining an external
> > > > > > > > > resource, which might cause problems. Instead, I would suggest having an
> > > > > > > > > interface like `Map<String key, String value>
> > > > > > > > > getYarn/KubernetesExternalResource()` and assembling the entries into the
> > > > > > > > > `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
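The interface split and the `instanceof` check suggested in the message above could look roughly like the following Java sketch. Only `ExternalResourceDriver` is a name from the thread; the two provider interfaces, `GpuDriver`, and `ResourceManagerSide` are hypothetical names chosen for illustration. Returning plain maps also means the driver never references Yarn or Kubernetes client classes.

```java
import java.util.Collections;
import java.util.Map;

/** Base interface: deliberately free of deployment-specific methods. */
interface ExternalResourceDriver {
    String resourceName();
}

/** Implemented only by drivers that support Yarn (hypothetical name). */
interface YarnExternalResourceProvider {
    Map<String, String> getYarnExternalResource();
}

/** Implemented only by drivers that support Kubernetes (hypothetical name). */
interface KubernetesExternalResourceProvider {
    Map<String, String> getKubernetesExternalResource();
}

/** Example driver that supports Kubernetes but not Yarn. */
final class GpuDriver implements ExternalResourceDriver, KubernetesExternalResourceProvider {
    @Override
    public String resourceName() {
        return "gpu";
    }

    @Override
    public Map<String, String> getKubernetesExternalResource() {
        return Collections.singletonMap("nvidia.com/gpu", "1");
    }
}

/** Stand-in for the RM code that assembles the entries into the actual request. */
final class ResourceManagerSide {
    static Map<String, String> yarnEntries(ExternalResourceDriver driver) {
        if (driver instanceof YarnExternalResourceProvider) {
            return ((YarnExternalResourceProvider) driver).getYarnExternalResource();
        }
        return Collections.emptyMap(); // driver does not support Yarn
    }

    static Map<String, String> kubernetesEntries(ExternalResourceDriver driver) {
        if (driver instanceof KubernetesExternalResourceProvider) {
            return ((KubernetesExternalResourceProvider) driver).getKubernetesExternalResource();
        }
        return Collections.emptyMap(); // driver does not support Kubernetes
    }
}
```

A driver expresses that it does not support a deployment target simply by not implementing the corresponding interface, and the RM only ever sees key-value pairs rather than the full request object.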
> > > > > > > > > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > I'm a bit late to the party. I think the current proposal looks good.
> > > > > > > > > >
> > > > > > > > > > Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> > > > > > > > > > would suggest to not include the decorator calls for Kubernetes and Yarn in
> > > > > > > > > > the base interface. Instead I would suggest to segregate the deployment
> > > > > > > > > > specific decorator calls into separate interfaces. That way an
> > > > > > > > > > ExternalResourceDriver does not have to support all deployments from the
> > > > > > > > > > very beginning. Moreover, some resources might not be supported by a
> > > > > > > > > > specific deployment target and the natural way to express this would be to
> > > > > > > > > > not implement the respective deployment specific interface.
> > > > > > > > > >
> > > > > > > > > > Moreover, having void
> > > > > > > > > > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > > > > > > > > > in the ExternalResourceDriver interface would require Hadoop on Flink's
> > > > > > > > > > classpath whenever the external resource driver is being used.
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Till
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <
> > > [hidden email]>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Nice, thanks a lot!
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <
> > > > > [hidden email]>
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for the suggestion, @Stephan, @Becket and
> > > @Xintong.
> > > > > > > > > > > >
> > > > > > > > > > > > I've updated the FLIP accordingly. I do not add a
> > > > > > > > > > > > ResourceInfoProvider. Instead, I introduce the
> > > > > > > > ExternalResourceDriver,
> > > > > > > > > > > > which takes the responsibility of all relevant
> > > operations on
> > > > > > both
> > > > > > > > RM
> > > > > > > > > > > > and TM sides.
> > > > > > > > > > > > After a rethink about decoupling the management of
> > > external
> > > > > > > > resources
> > > > > > > > > > > > from TaskExecutor, I think we could do the same
> thing on
> > > the
> > > > > > > > > > > > ResourceManager side. We do not need to add a
> specific
> > > > > > allocation
> > > > > > > > > > > > logic to the ResourceManager each time we add a
> specific
> > > > > > external
> > > > > > > > > > > > resource.
> > > > > > > > > > > > - For Yarn, we need the ExternalResourceDriver to
> edit
> > > the
> > > > > > > > > > > > containerRequest.
> > > > > > > > > > > > > - For Kubernetes, ExternalResourceDriver could
> provide a
> > > > > > decorator
> > > > > > > > for
> > > > > > > > > > > > the TM pod.
> > > > > > > > > > > >
> > > > > > > > > > > > In this way, just like MetricReporter, we allow
> users to
> > > > > define
> > > > > > > > their
> > > > > > > > > > > > custom ExternalResourceDriver. It is more extensible
> and
> > > fits
> > > > > > the
> > > > > > > > > > > > separation of concerns. For more details, please
> take a
> > > look
> > > > > at
> > > > > > > > [1].
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <
> > > > > [hidden email]
> > > > > > >
> > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > This sounds good to go ahead from my side.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I like the approach that Becket suggested - in that
> > > case
> > > > > the
> > > > > > core
> > > > > > > > > > > > > abstraction that everyone would need to understand
> > > would be
> > > > > > > > "external
> > > > > > > > > > > > > resource allocation" and the
> "ResourceInfoProvider",
> > > and
> > > > > the
> > > > > > GPU
> > > > > > > > > > > specific
> > > > > > > > > > > > > code would be a specific implementation only known
> to
> > > that
> > > > > > > > component
> > > > > > > > > > > that
> > > > > > > > > > > > > allocates the external resource. That fits the
> > > separation
> > > > > of
> > > > > > > > concerns
> > > > > > > > > > > > well.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I also understand that it should not be
> > > over-engineered in
> > > > > > the
> > > > > > > > first
> > > > > > > > > > > > > version, so some simplification makes sense, and
> then
> > > > > > gradually
> > > > > > > > > > expand
> > > > > > > > > > > > from
> > > > > > > > > > > > > there.
> > > > > > > > > > > > >
> > > > > > > > > > > > > So +1 to go ahead with what was suggested above
> > > (Xintong /
> > > > > > > > Becket)
> > > > > > > > > > from
> > > > > > > > > > > > my
> > > > > > > > > > > > > side.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <
> > > > > > > > [hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I see your concern, and I completely agree with
> you
> > > that
> > > > > we
> > > > > > > > should
> > > > > > > > > > > > first
> > > > > > > > > > > > > > think about the "library" / "plugin" /
> "extension"
> > > style
> > > > > if
> > > > > > > > > > possible.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If GPUs are sliced and assigned during
> scheduling,
> > > there
> > > > > > may be
> > > > > > > > > > > reason,
> > > > > > > > > > > > > > > although it looks that it would belong to the
> slot
> > > > > then.
> > > > > > Is
> > > > > > > > that
> > > > > > > > > > > > what we
> > > > > > > > > > > > > > > are doing here?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In the current proposal, we do not have the GPUs
> > > sliced
> > > > > and
> > > > > > > > > > assigned
> > > > > > > > > > > to
> > > > > > > > > > > > > > slots, because it could be problematic without
> > > dynamic
> > > > > slot
> > > > > > > > > > > allocation.
> > > > > > > > > > > > > > E.g., the number of GPUs might not be evenly
> > > divisible by
> > > > > > the
> > > > > > > > > > number
> > > > > > > > > > > of
> > > > > > > > > > > > > > slots.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think it makes sense to eventually have the
> GPUs
> > > > > > assigned to
> > > > > > > > > > slots.
> > > > > > > > > > > > Even
> > > > > > > > > > > > > > then, we might still need a TM level GPUManager
> (or
> > > > > > > > > > ResourceProvider
> > > > > > > > > > > > like
> > > > > > > > > > > > > > Becket suggested). For memory, in each slot we
> can
> > > simply
> > > > > > > > request
> > > > > > > > > > the
> > > > > > > > > > > > > > amount of memory, leaving it to JVM / OS to
> decide
> > > which
> > > > > > memory
> > > > > > > > > > > > (address)
> > > > > > > > > > > > > > should be assigned. For GPU, and potentially
> other
> > > > > > resources
> > > > > > > > like
> > > > > > > > > > > > FPGA, we
> > > > > > > > > > > > > > need to explicitly specify which GPU (index)
> should
> > > be
> > > > > > used.
> > > > > > > > > > > > Therefore, we
> > > > > > > > > > > > > > need some component at the TM level to coordinate
> > > which
> > > > > > slot
> > > > > > > > uses
> > > > > > > > > > > which
> > > > > > > > > > > > > > GPU.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > IMO, unless we say Flink will not support
> slot-level
> > > GPU
> > > > > > > > slicing at
> > > > > > > > > > > > least
> > > > > > > > > > > > > > in the foreseeable future, I don't see a good
> way to
> > > > > avoid
> > > > > > > > touching
> > > > > > > > > > > > the TM
> > > > > > > > > > > > > > core. To that end, I think Becket's suggestion
> > > points to
> > > > > a
> > > > > > good
> > > > > > > > > > > > direction,
> > > > > > > > > > > > > > that supports more features (GPU, FPGA, etc.)
> with
> > > less
> > > > > > > > coupling to
> > > > > > > > > > > > the TM
> > > > > > > > > > > > > > core (only needs to understand the general
> > > interfaces).
> > > > > The
> > > > > > > > > > detailed
> > > > > > > > > > > > > > implementation for specific resource types can
> even
> > > be
> > > > > > > > encapsulated
> > > > > > > > > > > as
> > > > > > > > > > > > a
> > > > > > > > > > > > > > library.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for sharing your thought on the final
> state.
> > > > > > Despite the
> > > > > > > > > > > > details how
> > > > > > > > > > > > > > the interfaces should look like, I think this is
> a
> > > really
> > > > > > good
> > > > > > > > > > > > abstraction
> > > > > > > > > > > > > > for supporting general resource types.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'd like to further clarify that, the following
> three
> > > > > > things
> > > > > > > > are
> > > > > > > > > > all
> > > > > > > > > > > > that
> > > > > > > > > > > > > > the "Flink core" needs to understand.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - The *amount* of resource, for scheduling.
> > > Actually,
> > > > > we
> > > > > > > > already
> > > > > > > > > > > > have
> > > > > > > > > > > > > > the Resource class in ResourceProfile and
> > > ResourceSpec
> > > > > > for
> > > > > > > > > > > extended
> > > > > > > > > > > > > > resource. It's just not really used.
> > > > > > > > > > > > > > - The *info*, that Flink provides to the
> > > operators /
> > > > > > user
> > > > > > > > codes.
> > > > > > > > > > > > > > - The *provider*, which generates the info
> based
> > > on
> > > > > the
> > > > > > > > amount.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The "core" does not need to understand the
> specific
> > > > > > > > implementation
> > > > > > > > > > > > details
> > > > > > > > > > > > > > of the above three. They can even be implemented
> in a
> > > > > > 3rd-party
> > > > > > > > > > > > library.
> > > > > > > > > > > > > > Similar to how we allow users to define their
> custom
> > > > > > > > > > MetricReporter.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <
> > > > > > > > [hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the comment, Stephan.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - If everything becomes a "core feature", it
> will
> > > > > make
> > > > > > the
> > > > > > > > > > > project
> > > > > > > > > > > > hard
> > > > > > > > > > > > > > > > to develop in the future. Thinking "library"
> /
> > > > > > "plugin" /
> > > > > > > > > > > > "extension"
> > > > > > > > > > > > > > > style
> > > > > > > > > > > > > > > > where possible helps.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Completely agree. It is much more important to
> > > design a
> > > > > > > > mechanism
> > > > > > > > > > > > than
> > > > > > > > > > > > > > > focusing on a specific case. Here is what I am
> > > thinking
> > > > > > to
> > > > > > > > fully
> > > > > > > > > > > > support
> > > > > > > > > > > > > > > custom resource management:
> > > > > > > > > > > > > > > 1. On the JM / RM side, use ResourceProfile and
> > > > > > ResourceSpec
> > > > > > > > to
> > > > > > > > > > > > define
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > resource and the amount required. They will be
> > > used to
> > > > > > find
> > > > > > > > > > > suitable
> > > > > > > > > > > > TMs
> > > > > > > > > > > > > > > slots to run the tasks. At this point, the
> > > resources
> > > > > are
> > > > > > only
> > > > > > > > > > > > measured by
> > > > > > > > > > > > > > > amount, i.e. they do not have individual ID.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. On the TM side, have something like
> > > > > > > > *"ResourceInfoProvider"*
> > > > > > > > > > to
> > > > > > > > > > > > > > identify
> > > > > > > > > > > > > > > > and provide the detailed information of the
> > > individual
> > > > > > > > resource,
> > > > > > > > > > > e.g.
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > ID. It is important because the operator may
> have
> > > to
> > > > > > > > explicitly
> > > > > > > > > > > > interact
> > > > > > > > > > > > > > > with the physical resource it uses. The
> > > > > > ResourceInfoProvider
> > > > > > > > > > might
> > > > > > > > > > > > look
> > > > > > > > > > > > > > > like something below.
> > > > > > > > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > > > > > > > > Map<AbstractID, INFO>
> > > > > retrieveResourceInfo(OperatorId
> > > > > > > > opId,
> > > > > > > > > > > > > > > ResourceProfile resourceProfile);
> > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > - There could be several
> "*ResourceInfoProvider*"
> > > > > > configured
> > > > > > > > on
> > > > > > > > > > the
> > > > > > > > > > > > TM to
> > > > > > > > > > > > > > > retrieve the information for different
> resources.
> > > > > > > > > > > > > > > > - The TM will be responsible for assigning those
> > > individual
> > > > > > > > resources
> > > > > > > > > > > to
> > > > > > > > > > > > each
> > > > > > > > > > > > > > > operator according to their requested amount.
> > > > > > > > > > > > > > > - The operators will be able to get the
> > > ResourceInfo
> > > > > from
> > > > > > > > their
> > > > > > > > > > > > > > > RuntimeContext.
> > > > > > > > > > > > > > >
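A rough sketch of how the TM-side provider described above might hand out individual GPUs by amount. This is an assumption-laden simplification: plain strings and ints stand in for `OperatorId` and `ResourceProfile`, the return type is a list rather than a map, and `GpuInfo` / `GpuInfoProvider` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical, simplified stand-ins for the types named in the proposal.
final class ResourceInfoSketch {

    /** Turns a requested amount into information about concrete resources. */
    interface ResourceInfoProvider<INFO> {
        List<INFO> retrieveResourceInfo(String operatorId, int amount);
    }

    /** The info an operator would see: here just the GPU index. */
    static final class GpuInfo {
        final int index;

        GpuInfo(int index) {
            this.index = index;
        }
    }

    /** Hands out GPU indexes from a TM-level pool, each index at most once. */
    static final class GpuInfoProvider implements ResourceInfoProvider<GpuInfo> {
        private final Deque<Integer> freeIndexes = new ArrayDeque<>();

        GpuInfoProvider(int gpuCount) {
            for (int i = 0; i < gpuCount; i++) {
                freeIndexes.add(i);
            }
        }

        @Override
        public List<GpuInfo> retrieveResourceInfo(String operatorId, int amount) {
            if (amount > freeIndexes.size()) {
                throw new IllegalStateException(
                        "Not enough free GPUs for operator " + operatorId);
            }
            List<GpuInfo> assigned = new ArrayList<>();
            for (int i = 0; i < amount; i++) {
                // Remove the index from the pool so no other operator receives it.
                assigned.add(new GpuInfo(freeIndexes.poll()));
            }
            return assigned;
        }
    }
}
```

The scheduler side only ever deals with the amount; the index is decided here, at the TM level, which is the coordination point the thread argues for.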
> > > > > > > > > > > > > > > > If we agree this is a reasonable final state,
> we
> > > can
> > > > > > adapt
> > > > > > > > the
> > > > > > > > > > > > current
> > > > > > > > > > > > > > FLIP
> > > > > > > > > > > > > > > > to it. In fact it does not sound like a big change
> to
> > > me.
> > > > > All
> > > > > > the
> > > > > > > > > > > proposed
> > > > > > > > > > > > > > > configuration can be as is, it is just that
> Flink
> > > > > itself
> > > > > > > > won't
> > > > > > > > > > care
> > > > > > > > > > > > about
> > > > > > > > > > > > > > > > them, instead a GPUInfoProvider implementing
> the
> > > > > > > > > > > ResourceInfoProvider
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > use them.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <
> > > > > > > > [hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi all!
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The main point I wanted to throw into the
> > > discussion
> > > > > > is the
> > > > > > > > > > > > following:
> > > > > > > > > > > > > > > > - With more and more use cases, more and
> more
> > > tools
> > > > > > go
> > > > > > > > into
> > > > > > > > > > > Flink
> > > > > > > > > > > > > > > > - If everything becomes a "core feature",
> it
> > > will
> > > > > > make
> > > > > > > > the
> > > > > > > > > > > > project
> > > > > > > > > > > > > > hard
> > > > > > > > > > > > > > > > to develop in the future. Thinking "library"
> /
> > > > > > "plugin" /
> > > > > > > > > > > > "extension"
> > > > > > > > > > > > > > > style
> > > > > > > > > > > > > > > > where possible helps.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - A good thought experiment is always: How
> many
> > > > > > future
> > > > > > > > > > > developers
> > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > interact with this code (and possibly
> understand
> > > it
> > > > > > > > partially),
> > > > > > > > > > > > even if
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > features they touch have nothing to do with
> GPU
> > > > > > support. If
> > > > > > > > > > many
> > > > > > > > > > > > > > > > contributors to unrelated features will have
> to
> > > touch
> > > > > > it
> > > > > > > > and
> > > > > > > > > > > > understand
> > > > > > > > > > > > > > > it,
> > > > > > > > > > > > > > > > then let's think if there is a different
> > > solution.
> > > > > > Maybe
> > > > > > > > there
> > > > > > > > > > is
> > > > > > > > > > > > not,
> > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > then we should be sure why.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - That led me to raising this issue: If
> the GPU
> > > > > > manager
> > > > > > > > > > > becomes a
> > > > > > > > > > > > > > core
> > > > > > > > > > > > > > > > service in the TaskManager, Environment,
> > > > > > RuntimeContext,
> > > > > > > > etc.
> > > > > > > > > > > then
> > > > > > > > > > > > > > > everyone
> > > > > > > > > > > > > > > > > developing TM and streaming tasks needs to
> > > understand
> > > > > > the
> > > > > > > > GPU
> > > > > > > > > > > > manager.
> > > > > > > > > > > > > > > That
> > > > > > > > > > > > > > > > seems oddly specific, is my impression.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Access to configuration seems not the right
> > > reason to
> > > > > > do
> > > > > > > > that.
> > > > > > > > > > We
> > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > expose the Flink configuration from the
> > > > > RuntimeContext
> > > > > > > > anyways.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > If GPUs are sliced and assigned during
> > > scheduling,
> > > > > > there
> > > > > > > > may be
> > > > > > > > > > > > reason,
> > > > > > > > > > > > > > > > although it looks that it would belong to the
> > > slot
> > > > > > then. Is
> > > > > > > > > > that
> > > > > > > > > > > > what
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > are doing here?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song
> <
> > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > IMO, eventually an operator should only see
> > > info of
> > > > > > GPUs
> > > > > > > > that
> > > > > > > > > > > are
> > > > > > > > > > > > > > > > dedicated
> > > > > > > > > > > > > > > > > for it, instead of all GPUs on the
> > > > > machine/container
> > > > > > in
> > > > > > > > the
> > > > > > > > > > > > current
> > > > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > It does not make sense to let the user who
> > > writes a
> > > > > > UDF
> > > > > > > > to
> > > > > > > > > > > worry
> > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > coordination among multiple operators
> running
> > > on
> > > > > the
> > > > > > same
> > > > > > > > > > > > machine.
> > > > > > > > > > > > > > And
> > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > we want to limit the GPU info an operator
> > > sees, we
> > > > > > > > should not
> > > > > > > > > > > > let the
> > > > > > > > > > > > > > > > > > operator instantiate GPUManager, which
> > > means we
> > > > > > have
> > > > > > > > to
> > > > > > > > > > > expose
> > > > > > > > > > > > > > > > something
> > > > > > > > > > > > > > > > > through runtime context, either GPU info or
> > > some
> > > > > > kind of
> > > > > > > > > > > limited
> > > > > > > > > > > > > > access
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > the GPUManager.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin
> <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > It probably makes sense for us to first
> agree
> > > on
> > > > > the
> > > > > > > > final
> > > > > > > > > > > > state.
> > > > > > > > > > > > > > More
> > > > > > > > > > > > > > > > > > specifically, will the resource info be
> > > exposed
> > > > > > through
> > > > > > > > > > > runtime
> > > > > > > > > > > > > > > context
> > > > > > > > > > > > > > > > > > eventually?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > If that is the final state and we have a
> > > seamless
> > > > > > > > migration
> > > > > > > > > > > > story
> > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > FLIP to that final state, Personally I
> think
> > > it
> > > > > is
> > > > > > OK
> > > > > > > > to
> > > > > > > > > > > > expose the
> > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > info in the runtime context.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong
> > > Song <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > > > > > > > > I think what Stephan means (@Stephan,
> > > please
> > > > > > correct
> > > > > > > > me
> > > > > > > > > > if
> > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > wrong)
> > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > that, we might not need to hold and
> > > maintain
> > > > > the
> > > > > > > > > > GPUManager
> > > > > > > > > > > > as a
> > > > > > > > > > > > > > > > > service
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > TaskManagerServices or RuntimeContext.
> An
> > > > > > > > alternative is
> > > > > > > > > > to
> > > > > > > > > > > > > > create
> > > > > > > > > > > > > > > /
> > > > > > > > > > > > > > > > > > > retrieve the GPUManager only in the
> > > operators
> > > > > > that
> > > > > > > > need
> > > > > > > > > > it,
> > > > > > > > > > > > e.g.,
> > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > > > > > > > > I agree with you on excluding
> GPUManager
> > > from
> > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > - For the first step, where we
> provide
> > > > > unified
> > > > > > > > > > TM-level
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > information
> > > > > > > > > > > > > > > > > > > to all operators, it should be fine
> to
> > > have
> > > > > > > > operators
> > > > > > > > > > > > access /
> > > > > > > > > > > > > > > > > > > lazy-initiate GPUManager by
> themselves.
> > > > > > > > > > > > > > > > > > > - In future, we might have some more
> > > > > > fine-grained
> > > > > > > > GPU
> > > > > > > > > > > > > > > management,
> > > > > > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > > > > we need to maintain GPUManager as a
> > > service
> > > > > > and
> > > > > > > > put
> > > > > > > > > > GPU
> > > > > > > > > > > > info
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > slot
> > > > > > > > > > > > > > > > > > > profiles. But at least for now it's
> not
> > > > > > necessary
> > > > > > > > to
> > > > > > > > > > > > introduce
> > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > > complexity.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > However, I have some concerns on
> excluding
> > > > > > GPUManager
> > > > > > > > > > from
> > > > > > > > > > > > > > > > > RuntimeContext
> > > > > > > > > > > > > > > > > > > and let operators access it directly.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > - Configurations needed for
> creating the
> > > > > > > > > GPUManager are
> > > > > > > > > > > not
> > > > > > > > > > > > > > > always
> > > > > > > > > > > > > > > > > > > available for operators.
> > > > > > > > > > > > > > > > > > > - If later we want to have
> fine-grained
> > > > > > control
> > > > > > > > over
> > > > > > > > > > GPU
> > > > > > > > > > > > > > (e.g.,
> > > > > > > > > > > > > > > > > > > operators in each slot can only see
> GPUs
> > > > > > reserved
> > > > > > > > for
> > > > > > > > > > > that
> > > > > > > > > > > > > > > slot),
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I would suggest to wrap the GPUManager
> > > behind
> > > > > > > > > > > RuntimeContext
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > only
> > > > > > > > > > > > > > > > > > > expose the GPUInfo to users. For now,
> we
> > > can
> > > > > > declare
> > > > > > > > a
> > > > > > > > > > > method
> > > > > > > > > > > > > > > > > > > `getGPUInfo()` in RuntimeContext, with
> a
> > > > > default
> > > > > > > > > > definition
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > calls
> > > > > > > > > > > > > > > > > > > `GPUManager.get()` to get the
> > > lazily-created
> > > > > > > > GPUManager.
> > > > > > > > > > If
> > > > > > > > > > > > later
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > want
> > > > > > > > > > > > > > > > > > > to create / retrieve GPUManager in a
> > > different
> > > > > > way,
> > > > > > > > we
> > > > > > > > > > can
> > > > > > > > > > > > simply
> > > > > > > > > > > > > > > > > change
> > > > > > > > > > > > > > > > > > > how `getGPUInfo` is implemented,
> without
> > > > > needing
> > > > > > to
> > > > > > > > > > change
> > > > > > > > > > > > any
> > > > > > > > > > > > > > > public
> > > > > > > > > > > > > > > > > > > interfaces.
> > > > > > > > > > > > > > > > > > >
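The wrapping idea above can be sketched roughly as follows. The `RuntimeContext` and `GPUManager` here are hypothetical stand-ins defined inside the example, not Flink's classes, and the GPU indexes are hard-coded where a real manager would discover them from configuration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins; not Flink's actual RuntimeContext or any real GPUManager.
final class GpuContextSketch {

    /** Lazily created, process-wide manager. */
    static final class GPUManager {
        private static GPUManager instance;
        private final List<Integer> gpuIndexes;

        private GPUManager(List<Integer> gpuIndexes) {
            this.gpuIndexes = Collections.unmodifiableList(gpuIndexes);
        }

        static synchronized GPUManager get() {
            if (instance == null) {
                // Assumption for the sketch: two GPUs, indexes hard-coded.
                instance = new GPUManager(Arrays.asList(0, 1));
            }
            return instance;
        }

        List<Integer> getGpuIndexes() {
            return gpuIndexes;
        }
    }

    /** Only the GPU info is exposed to user code; the manager stays hidden. */
    interface RuntimeContext {
        default List<Integer> getGPUInfo() {
            return GPUManager.get().getGpuIndexes();
        }
    }
}
```

If the manager later becomes a per-slot service, only the body of `getGPUInfo()` would need to change; callers of the method are unaffected.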
> > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze
> > > Guo <
> > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > > > > > > > Do you mean Minicluster? Yes, it
> makes
> > > sense
> > > > > to
> > > > > > > > share
> > > > > > > > > > the
> > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > Manager
> > > > > > > > > > > > > > > > > > > > in such scenario.
> > > > > > > > > > > > > > > > > > > > If that's what you worry about, I'm
> +1
> > > for
> > > > > > holding
> > > > > > > > > > > > > > > > > > > > GPUManager(ExternalResourceManagers)
> in
> > > > > > > > TaskExecutor
> > > > > > > > > > > > instead of
> > > > > > > > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Regarding the
> > > RuntimeContext/FunctionContext,
> > > > > > it
> > > > > > > > just
> > > > > > > > > > > > holds the
> > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > info instead of the GPU Manager.
> AFAIK,
> > > it's
> > > > > > the
> > > > > > > > only
> > > > > > > > > > > > place we
> > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > > pass GPU info to the
> > > > > > > > RichFunction/UserDefinedFunction.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac
> > > > > Godfried
> > > > > > <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20
> +0000
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > wrote
> > > > > > > > > > > > > > > > > ----
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Can we somehow keep this out
> of the
> > > > > > > > TaskManager
> > > > > > > > > > > > services
> > > > > > > > > > > > > > > > > > > > > > I fear that we could not. IMO,
> the
> > > > > > > > GPUManager(or
> > > > > > > > > > > > > > > > > > > > > > ExternalServicesManagers in
> future)
> > > is
> > > > > > > > conceptually
> > > > > > > > > > > > one of
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > task
> > > > > > > > > > > > > > > > > > > > > > manager services, just like
> > > MemoryManager
> > > > > > > > before
> > > > > > > > > > > 1.10.
> > > > > > > > > > > > > > > > > > > > > > - It maintains/holds the GPU
> > > resource at
> > > > > TM
> > > > > > > > level
> > > > > > > > > > and
> > > > > > > > > > > > all
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > operators allocate the GPU
> resources
> > > from
> > > > > > it.
> > > > > > > > So,
> > > > > > > > > > it
> > > > > > > > > > > > should
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > > exclusive to a single
> TaskExecutor.
> > > > > > > > > > > > > > > > > > > > > > - We could add a collection
> called
> > > > > > > > > > > > ExternalResourceManagers
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > hold
> > > > > > > > > > > > > > > > > > > > > > all managers of other external
> > > resources
> > > > > > in the
> > > > > > > > > > > future.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Can you help me understand why this
> > > needs
> > > > > the
> > > > > > > > > > addition
> > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > TaskManagerServices
> > > > > > > > > > > > > > > > > > > > > or in the RuntimeContext?
> > > > > > > > > > > > > > > > > > > > > Are you worried about the case when
> > > > > multiple
> > > > > > Task
> > > > > > > > > > > > Executors
> > > > > > > > > > > > > > run
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > > > > > > > JVM? That's not common, but
> wouldn't it
> > > > > > actually
> > > > > > > > be
> > > > > > > > > > > good
> > > > > > > > > > > > in
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > case
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > share the GPU Manager, given that
> the
> > > GPU
> > > > > is
> > > > > > > > shared?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > What parts need information about this?
> > > > > > > > > > > > > > > > > > > > > > In this FLIP, operators need the information. Thus, we expose GPU information to the RuntimeContext/FunctionContext. The slot profile is not aware of GPU resources, as GPU is a TM-level resource now.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Can the GPU Manager be a "self contained" thing that simply takes the configuration, and then abstracts everything internally?
> > > > > > > > > > > > > > > > > > > > > > Yes, we just pass the path/args of the discovery script and how many GPUs per TM to it. It takes the responsibility to get the GPU information and expose it to the RuntimeContext/FunctionContext of Operators. Meanwhile, we'd better not allow operators to directly access GPUManager; they should get what they want from Context. We could then decouple the interface/implementation of GPUManager and Public API.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > >
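The decoupling described in the reply above (operators read GPU information from their context and never hold a reference to the manager) might look roughly like the following. This is only a sketch under stated assumptions: `GpuInfo`, `GpuAwareContext`, `SketchGpuManager` and every method name are hypothetical and are not taken from the FLIP or from Flink's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical public-facing type: the only GPU detail an operator can see.
final class GpuInfo {
    private final int index;

    GpuInfo(int index) {
        this.index = index;
    }

    int getIndex() {
        return index;
    }
}

// Hypothetical slice of a RuntimeContext/FunctionContext-like interface.
interface GpuAwareContext {
    List<GpuInfo> getGpuInfos();
}

// Hypothetical TM-internal manager; operators never get a reference to it.
final class SketchGpuManager {
    private final List<GpuInfo> discovered = new ArrayList<>();

    // In the proposal the indexes would come from a discovery script;
    // here they are simply passed in.
    SketchGpuManager(List<Integer> discoveredIndexes) {
        for (int i : discoveredIndexes) {
            discovered.add(new GpuInfo(i));
        }
    }

    // The runtime builds the context; operators only receive a read-only view.
    GpuAwareContext createContext() {
        List<GpuInfo> view = Collections.unmodifiableList(new ArrayList<>(discovered));
        return () -> view;
    }
}
```

With this split, the manager's implementation can change freely as long as the context interface stays stable.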
> > > > > > > > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > It sounds fine to initially start with GPU specific support and think about generalizing this once we better understand the space.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager services? Anything we have to pull through all layers of the TM makes the TM components yet more complex and harder to maintain.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > > > > > > > > > > > -> do the slot profiles need information about the GPU?
> > > > > > > > > > > > > > > > > > > > > > > -> Can the GPU Manager be a "self contained" thing that simply takes the configuration, and then abstracts everything internally? Operators can access it via "GPUManager.get()" or so?
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right, I'll add them to the Public API section.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > > > > > > > Regarding the general extended resource mechanism, I second Xintong's suggestion.
> > > > > > > > > > > > > > > > > > > > > > > > - It's better to leverage ResourceProfile and ResourceSpec after we support fine-grained GPU scheduling. As a first step proposal, I prefer to not include it in the scope of this FLIP.
> > > > > > > > > > > > > > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I understand correctly, it is just a code refactoring atm; we could extract the open/close/allocateExtendResources of GPUManager to that interface. If that is the case, +1 to do it during implementation.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > > > > > > > As Xintong said, we looked into how Spark supports a general "Custom Resource Scheduling" before and decided to introduce a common resource configuration schema (taskmanager.resource.{resourceName}.amount/discovery-script) to make it more extensible. I think the "resource" is a proper level to contain all the configs of extended resources.
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > > >
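To make the key pattern `taskmanager.resource.{resourceName}.amount/discovery-script` quoted above concrete, here is a minimal sketch of how such keys could be derived per resource type. Only the pattern itself comes from the thread; the class `ResourceConfigKeys`, its methods, and the flat `Map` standing in for a configuration object are assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical helper; only the key pattern is quoted from the discussion.
final class ResourceConfigKeys {
    private static final String PREFIX = "taskmanager.resource.";

    private ResourceConfigKeys() {}

    // e.g. "gpu" -> "taskmanager.resource.gpu.amount"
    static String amountKey(String resourceName) {
        return PREFIX + resourceName + ".amount";
    }

    // e.g. "gpu" -> "taskmanager.resource.gpu.discovery-script"
    static String discoveryScriptKey(String resourceName) {
        return PREFIX + resourceName + ".discovery-script";
    }

    // Reads the amount for one resource type from a flat key/value config;
    // returns 0 when the key is not set.
    static long amountOf(Map<String, String> conf, String resourceName) {
        String raw = conf.get(amountKey(resourceName));
        return raw == null ? 0L : Long.parseLong(raw.trim());
    }
}
```

Because the resource name is just one segment of the key, supporting another resource type (say, a hypothetical "fpga") would need no new option definitions in this sketch.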
> > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource management support will greatly facilitate the development of AI-related applications by PyFlink users.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I think it is better to delete the resource field, which makes it consistent with the names of other resource-related configurations in TaskManagerOption.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path -> taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song <[hidden email]> wrote on Wed, Mar 4, 2020 at 10:39 AM:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline discussion about turning the "GPU Support" into a general "Extended Resource Support". We believe supporting extended resources through a general mechanism is definitely a good and extensible approach. The reason we propose this FLIP with its scope narrowed down to GPU alone is mainly the concern about the extra effort and review capacity needed for a general mechanism.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > To come up with a good design for a general extended resource management mechanism, we would need to investigate more on how people use different kinds of resources in practice. For GPU, we learnt such knowledge from the experts, Becket and his team members. But for FPGA, or other potential extended resources, we don't have such convenient information sources, making the investigation require more effort, which I tend to think is not necessary atm.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark supports a general "Custom Resource Scheduling". Assuming we want to have a similar general extended resource mechanism in the future, we believe that the current GPU support design can be easily extended, in an incremental way without too much rework.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > - The most important part is probably user interfaces. Spark offers configuration options to define the amount, discovery script and vendor (on k8s) on a per-resource-type basis [1], which is very similar to what we proposed in this FLIP. I think it's not necessary to expose config options in the general way atm, since we do not have support for other resource types now. If later we decide to have per-resource-type config options, we can have backwards compatibility on the current proposed options with simple key mapping.
> > > > > > > > > > > > > > > > > > > > > > > > > > - For the GPU Manager, if later needed we can change it to an "Extended Resource Manager" (or whatever it is called). That should be a pure component-internal refactoring.
> > > > > > > > > > > > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are already fields for general extended resources. We can of course leverage them when supporting fine-grained GPU scheduling. That is also not in the scope of this first step proposal, and would require FLIP-56 to be finished first.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a separate FLIP for the general extended resource mechanism, and keep it in mind when discussing and implementing the current one.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > [1] https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to generalize the resource management to support custom resources. Having that allows users to add new resources by themselves. The general resource management may involve two different aspects:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported by the extended resources in ResourceProfile and ResourceSpec. This will likely cover the majority of the cases.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to assign the resources to different tasks, operators, and so on. This may require two levels / steps:
> > > > > > > > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into suitable slots. It is done by the global RM and is not customizable right now.
> > > > > > > > > > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the operators in TM. e.g. GPU 1 for operator A, GPU 2 for operator B. This step is needed assuming the global RM does not distinguish individual resources of the same type. It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should discover the physical GPU information and bind/match them to each operator. Making this general will fill in the missing piece to support custom resource type definition. But I'd avoid calling it an "External Resource Manager" to avoid confusion with RM; maybe something like "Operator Resource Assigner" would be more accurate. So for each resource type users can have an optional "Operator Resource Assigner" in the TM. For memory, users don't need this, but for other extended resources, users may need that.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource Assigner" is achievable in this FLIP. But I am also OK with having that in a separate FLIP because the interface between the "Operator Resource Assigner" and operator may take a while to settle down if we want to make it generic. But I think our implementation should take this future work into consideration so that we don't need to break backwards compatibility once we have that.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at
> > > 12:27 AM
> > > > > > > > Stephan
> > > > > > > > > > > Ewen
> > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of GPU-aware scheduling
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > and GPU allocation, as I have no experience with that.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is if it makes sense to look at
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the "GPU Manager" as an "External Resource Manager", and GPU is one such
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > resource.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The way I understand the ResourceProfile and ResourceSpec, that is how it
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > is done there.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It has the advantage that it looks more extensible. Maybe there is a GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Resource, a specialized NVIDIA GPU Resource, an FPGA Resource, an Alibaba
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP, Yangze. GPU resource management support is a must-have
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > for machine learning use cases. Actually it is one of the most frequently
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > asked questions from users who are interested in using Flink for ML.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Some quick comments / questions to the wiki.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be mentioned in the public
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > interface section.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a public API?
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the discussion, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in Flink is significant,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I think it's a
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > very good first step for Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure how many GPUs a task executor has and
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > forward such requirements to the external resource managers (for
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU resources to operators.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services to discover
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > and expose GPU resource information to the context of functions.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which we provide
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > the privilege mode to help users achieve worker-level isolation in
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki document [1]. Looking forward to
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > your feedback.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > >
> > > > >
> > >
>
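
The TM-level coordination discussed in the quoted messages above (a GPUManager or "Operator Resource Assigner" that decides which GPU index is used by whom) can be illustrated with a minimal sketch. This is not code from the FLIP: the class name GpuIndexAssigner, its methods, and the idea of keying assignments by an owner string (a slot or operator id) are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; names and behavior are not taken from the FLIP. */
final class GpuIndexAssigner {
    // GPU indices that are currently not handed out.
    private final Deque<Integer> free = new ArrayDeque<>();
    // Indices held per owner (for example a slot id or an operator id).
    private final Map<String, List<Integer>> assigned = new HashMap<>();

    GpuIndexAssigner(int gpuCount) {
        for (int i = 0; i < gpuCount; i++) {
            free.addLast(i);
        }
    }

    /** Reserves {@code count} explicit GPU indices for the given owner. */
    synchronized List<Integer> assign(String owner, int count) {
        if (count > free.size()) {
            throw new IllegalStateException(
                    "requested " + count + " GPUs, only " + free.size() + " free");
        }
        List<Integer> held = assigned.computeIfAbsent(owner, k -> new ArrayList<>());
        List<Integer> granted = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int index = free.pollFirst();
            granted.add(index);
            held.add(index);
        }
        return granted;
    }

    /** Returns all indices held by the owner to the free pool. */
    synchronized void release(String owner) {
        List<Integer> held = assigned.remove(owner);
        if (held != null) {
            free.addAll(held);
        }
    }
}
```

Unlike memory, where a slot only asks for an amount, the sketch hands out concrete indices, which is the reason the thread gives for needing a coordinating component in the TM.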

Yangze Guo

Re: [DISCUSS] FLIP-108: Add GPU support in Flink

Thank you all for your participation! I'll start voting for this FLIP.

Best,
Yangze Guo
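
The configuration-option approach agreed on in the thread quoted below (replacing getYarn/KubernetesExternalResource with options) can be sketched roughly as follows. This is a hedged illustration, not Flink's API: the class ExternalResourceOptions and the method requestEntry are hypothetical, a plain Map stands in for flink-conf.yaml, and the key layout follows one reading of the thread (external-resource.{resourceName}.amount plus external-resource.{resourceName}.yarn/kubernetes.key).

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical helper; not part of Flink. */
final class ExternalResourceOptions {
    private static final String PREFIX = "external-resource.";

    /**
     * Builds the single key/amount pair that the RM would set on a pod or
     * container request for one external resource.
     *
     * @param conf flat configuration view (stand-in for flink-conf.yaml)
     * @param resourceName resource name used in the option keys, e.g. "gpu"
     * @param deployment "kubernetes" or "yarn"
     * @return one entry, or an empty map if the resource is not configured
     *     for this deployment
     */
    static Map<String, Long> requestEntry(
            Map<String, String> conf, String resourceName, String deployment) {
        String key = conf.get(PREFIX + resourceName + "." + deployment + ".key");
        String amount = conf.get(PREFIX + resourceName + ".amount");
        if (key == null || amount == null) {
            return Collections.emptyMap();
        }
        return Collections.singletonMap(key, Long.parseLong(amount));
    }
}
```

With external-resource.gpu.amount set to 2 and external-resource.gpu.kubernetes.key set to an illustrative value such as nvidia.com/gpu, the helper returns that key mapped to 2 for "kubernetes" and an empty map for "yarn". No user-defined class has to be loaded on the RM for this, which is the point raised about credential isolation.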

On Wed, Apr 1, 2020 at 4:55 PM Stephan Ewen <[hidden email]> wrote:

>
> Sounds good!
>
> On Tue, Mar 31, 2020 at 4:32 AM Yangze Guo <[hidden email]> wrote:
>
> > Hi everyone,
> > I've updated the FLIP accordingly. The key change is replacing the two
> > resource allocation interfaces with config options.
> >
> > If there are no further comments, I would like to start a voting
> > thread by tomorrow.
> >
> > Best,
> > Yangze Guo
> >
> > On Mon, Mar 30, 2020 at 9:15 PM Till Rohrmann <[hidden email]> wrote:
> > >
> > > If there is no need for the ExternalResourceDriver on the RM side, then it
> > > is always a good idea to keep it simple and not introduce it. One can
> > > always change things once one realizes that there is a need for it.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Mar 30, 2020 at 12:00 PM Yangze Guo <[hidden email]> wrote:
> > >
> > > > Hi @Till, @Xintong
> > > >
> > > > I think even without the credential concerns, replacing the interfaces
> > > > with configuration options is a good idea from my side.
> > > > - Currently, I don't see any external resource that is incompatible
> > > > with this mechanism.
> > > > - It spares users the burden of implementing a plugin themselves.
> > > > WDYT?
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Mon, Mar 30, 2020 at 5:44 PM Xintong Song <[hidden email]> wrote:
> > > > >
> > > > > I also agree that the pluggable ExternalResourceDriver should be loaded by
> > > > > the cluster class loader. Although the plugin might be implemented by users,
> > > > > external resources (as part of task executor resources) should be cluster
> > > > > configurations, unlike job-level user code such as UDFs, because the task
> > > > > executors belong to the cluster rather than to jobs.
> > > > >
> > > > >
> > > > > IIUC, the concern Stephan raised is about the potential credential problem
> > > > > when executing user code on the RM with the cluster class loader. The concern
> > > > > makes sense to me, and I think what Yangze suggested should be a good
> > > > > approach to preventing such credential problems. The only reason we
> > > > > tried to execute user code (i.e. getKubernetes/YarnExternalResource) on the RM
> > > > > was that we need to set these key-value pairs on pod/container requests.
> > > > > By replacing the interfaces getKubernetes/YarnExternalResource with the
> > > > > configuration options
> > > > > 'external-resource.{resourceName}.yarn/kubernetes.key/amount',
> > > > > we can still fulfill that purpose, without the credential risks.
> > > > >
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Mar 30, 2020 at 5:17 PM Till Rohrmann <[hidden email]> wrote:
> > > > > >
> > > > > > At the moment the RM does not have a user code class loader and I agree
> > > > > > with Stephan that it should stay like this. This, however, does not mean
> > > > > > that we cannot support pluggable components in the RM. As long as the
> > > > > > plugins are on the system's class path, it should be fine for the RM to
> > > > > > load them. For example, we could add external resources via Flink's plugin
> > > > > > mechanism or something similar.
> > > > > >
> > > > > > A very simple implementation of such an ExternalResourceDriver could be a
> > > > > > class which simply returns what is written in the flink-conf.yaml under a
> > > > > > given key.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Mon, Mar 30, 2020 at 5:39 AM Yangze Guo <[hidden email]> wrote:
> > > > > > >
> > > > > > > Hi, Stephan,
> > > > > > >
> > > > > > > I see your concern and I totally agree with you.
> > > > > > >
> > > > > > > The interface on the RM side is now `Map<String key, String/Long value>
> > > > > > > getYarn/KubernetesExternalResource()`. The only valid information the RM
> > > > > > > gets from it is the configuration key of that external resource in
> > > > > > > Yarn/K8s. The "String/Long value" would be the same as the
> > > > > > > external-resource.{resourceName}.amount.
> > > > > > > So, I think it makes sense to replace these two interfaces with two
> > > > > > > configs, i.e. external-resource.{resourceName}.yarn/kubernetes.key. We
> > > > > > > may lose some extensibility, but AFAIK it could work with common
> > > > > > > external resources like GPU, FPGA. WDYT?
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Fri, Mar 27, 2020 at 7:59 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > Maybe one final comment: It is probably not an issue, but let's try and
> > > > > > > > keep user code (via user code classloader) out of the ResourceManager, if
> > > > > > > > possible.
> > > > > > > >
> > > > > > > > As background:
> > > > > > > >
> > > > > > > > There were thoughts in the past to support setups where the RM must run
> > > > > > > > with "superuser" credentials, but we cannot run JM/TM with these
> > > > > > > > credentials, as the user code might access them otherwise.
> > > > > > > > This is actually possible today, you can run the RM in a different JVM or
> > > > > > > > in a different container, and give it more credentials than JMs / TMs. But
> > > > > > > > for this to be feasible, we cannot allow any user-defined code to be in the
> > > > > > > > JVM, because that instantaneously breaks the isolation of credentials.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Mar 27, 2020 at 4:01 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > Thanks for the feedback, @Till and @Xintong.
> > > > > > > > >
> > > > > > > > > Regarding separating the interface, I'm also +1 for it.
> > > > > > > > >
> > > > > > > > > Regarding the resource allocation interface, true, it's dangerous to
> > > > > > > > > give much access to user code. Changing the return type to Map<String
> > > > > > > > > key, String/Long value> makes sense to me. AFAIK, it is compatible
> > > > > > > > > with all the first-party supported resources for Yarn/Kubernetes. It
> > > > > > > > > could also free us from the potential dependency issue.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Fri, Mar 27, 2020 at 10:42 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > Thanks for updating the FLIP, Yangze.
> > > > > > > > > >
> > > > > > > > > > I agree with Till that we probably want to separate the K8s/Yarn decorator
> > > > > > > > > > calls. Users can still configure one driver class, and we can use
> > > > > > > > > > `instanceof` to check whether the driver implements the K8s/Yarn specific
> > > > > > > > > > interfaces.
> > > > > > > > > >
> > > > > > > > > > Moreover, I'm not sure about exposing the entire `ContainerRequest` / `Pod`
> > > > > > > > > > (`AbstractKubernetesStepDecorator` directly manipulates the `Pod`) to user
> > > > > > > > > > code. It gives more access to user code than needed for defining an external
> > > > > > > > > > resource, which might cause problems. Instead, I would suggest having an
> > > > > > > > > > interface like `Map<String key, String value>
> > > > > > > > > > getYarn/KubernetesExternalResource()` and assembling the results into
> > > > > > > > > > `ContainerRequest` / `Pod` in Yarn/KubernetesResourceManager.
> > > > > > > > > >
> > > > > > > > > > Thank you~
> > > > > > > > > >
> > > > > > > > > > Xintong Song
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 27, 2020 at 1:10 AM Till Rohrmann <[hidden email]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > I'm a bit late to the party. I think the current proposal looks good.
> > > > > > > > > > >
> > > > > > > > > > > Concerning the ExternalResourceDriver interface defined in the FLIP [1], I
> > > > > > > > > > > would suggest not including the decorator calls for Kubernetes and Yarn in
> > > > > > > > > > > the base interface. Instead I would suggest segregating the deployment
> > > > > > > > > > > specific decorator calls into separate interfaces. That way an
> > > > > > > > > > > ExternalResourceDriver does not have to support all deployments from the
> > > > > > > > > > > very beginning. Moreover, some resources might not be supported by a
> > > > > > > > > > > specific deployment target and the natural way to express this would be to
> > > > > > > > > > > not implement the respective deployment specific interface.
> > > > > > > > > > >
> > > > > > > > > > > Moreover, having void
> > > > > > > > > > > addExternalResourceToRequest(AMRMClient.ContainerRequest containerRequest)
> > > > > > > > > > > in the ExternalResourceDriver interface would require Hadoop on Flink's
> > > > > > > > > > > classpath whenever the external resource driver is being used.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Till
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 26, 2020 at 12:45 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Nice, thanks a lot!
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Mar 26, 2020 at 10:21 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the suggestion, @Stephan, @Becket and @Xintong.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've updated the FLIP accordingly. I do not add a
> > > > > > > > > > > > > ResourceInfoProvider. Instead, I introduce the ExternalResourceDriver,
> > > > > > > > > > > > > which takes the responsibility of all relevant operations on both RM
> > > > > > > > > > > > > and TM sides.
> > > > > > > > > > > > > After a rethink about decoupling the management of external resources
> > > > > > > > > > > > > from TaskExecutor, I think we could do the same thing on the
> > > > > > > > > > > > > ResourceManager side. We do not need to add a specific allocation
> > > > > > > > > > > > > logic to the ResourceManager each time we add a specific external
> > > > > > > > > > > > > resource.
> > > > > > > > > > > > > - For Yarn, we need the ExternalResourceDriver to edit the
> > > > > > > > > > > > > containerRequest.
> > > > > > > > > > > > > - For Kubernetes, ExternalResourceDriver could provide a decorator for
> > > > > > > > > > > > > the TM pod.
> > > > > > > > > > > > >
> > > > > > > > > > > > > In this way, just like MetricReporter, we allow users to define their
> > > > > > > > > > > > > custom ExternalResourceDriver. It is more extensible and fits the
> > > > > > > > > > > > > separation of concerns. For more details, please take a look at [1].
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Mar 25, 2020 at 7:32 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > This sounds good to go ahead from my side.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I like the approach that Becket suggested - in that case the core
> > > > > > > > > > > > > > abstraction that everyone would need to understand would be "external
> > > > > > > > > > > > > > resource allocation" and the "ResourceInfoProvider", and the GPU specific
> > > > > > > > > > > > > > code would be a specific implementation only known to that component that
> > > > > > > > > > > > > > allocates the external resource. That fits the separation of concerns well.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I also understand that it should not be over-engineered in the first
> > > > > > > > > > > > > > version, so some simplification makes sense, and then gradually expand from
> > > > > > > > > > > > > > there.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So +1 to go ahead with what was suggested above (Xintong / Becket) from my
> > > > > > > > > > > > > > side.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Mar 23, 2020 at 6:55 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the comments, Stephan & Becket.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I see your concern, and I completely agree with
> > you
> > > > that
> > > > > > we
> > > > > > > > > should
> > > > > > > > > > > > > first
> > > > > > > > > > > > > > > think about the "library" / "plugin" /
> > "extension"
> > > > style
> > > > > > if
> > > > > > > > > > > possible.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If GPUs are sliced and assigned during
> > scheduling,
> > > > there
> > > > > > > may be
> > > > > > > > > > > > reason,
> > > > > > > > > > > > > > > > although it looks that it would belong to the
> > slot
> > > > > > then.
> > > > > > > Is
> > > > > > > > > that
> > > > > > > > > > > > > what we
> > > > > > > > > > > > > > > > are doing here?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > In the current proposal, we do not have the GPUs sliced and assigned to
> > > > > > > > > > > > > > > > slots, because it could be problematic without dynamic slot allocation.
> > > > > > > > > > > > > > > > E.g., the number of GPUs might not be evenly divisible by the number of
> > > > > > > > > > > > > > > > slots.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think it makes sense to eventually have the GPUs assigned to slots. Even
> > > > > > > > > > > > > > > > then, we might still need a TM level GPUManager (or ResourceProvider like
> > > > > > > > > > > > > > > > Becket suggested). For memory, in each slot we can simply request the
> > > > > > > > > > > > > > > > amount of memory, leaving it to JVM / OS to decide which memory (address)
> > > > > > > > > > > > > > > > should be assigned. For GPU, and potentially other resources like FPGA, we
> > > > > > > > > > > > > > > > need to explicitly specify which GPU (index) should be used. Therefore, we
> > > > > > > > > > > > > > > > need some component at the TM level to coordinate which slot uses which
> > > > > > > > > > > > > > > > GPU.
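
A minimal sketch of the TM-level coordination described above, under stated assumptions: the class `GpuIndexCoordinator`, its methods, and the use of a plain string as slot identifier are all hypothetical and are not Flink APIs. It only illustrates why an amount is not enough for GPUs: each slot has to be handed concrete device indexes, and something has to remember which indexes are taken.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical TM-level coordinator; not a Flink class.
final class GpuIndexCoordinator {
    // GPU indexes that are not yet assigned to any slot.
    private final Deque<Integer> freeIndexes = new ArrayDeque<>();
    // Indexes currently held by each slot, keyed by a hypothetical slot id.
    private final Map<String, List<Integer>> assigned = new HashMap<>();

    GpuIndexCoordinator(int gpuCount) {
        for (int i = 0; i < gpuCount; i++) {
            freeIndexes.add(i);
        }
    }

    // Reserves `amount` concrete GPU indexes for a slot, or fails if too few are free.
    synchronized List<Integer> allocate(String slotId, int amount) {
        if (amount < 0 || amount > freeIndexes.size()) {
            throw new IllegalStateException("Not enough free GPUs for slot " + slotId);
        }
        List<Integer> indexes = new ArrayList<>();
        for (int i = 0; i < amount; i++) {
            indexes.add(freeIndexes.poll());
        }
        assigned.computeIfAbsent(slotId, k -> new ArrayList<>()).addAll(indexes);
        return Collections.unmodifiableList(indexes);
    }

    // Returns all indexes held by a slot to the free pool.
    synchronized void release(String slotId) {
        List<Integer> indexes = assigned.remove(slotId);
        if (indexes != null) {
            freeIndexes.addAll(indexes);
        }
    }

    synchronized int freeCount() {
        return freeIndexes.size();
    }
}
```

With three GPUs and two slots that each ask for two, the second request fails in this sketch, which is the "not evenly divisible" situation mentioned earlier in the message.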
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > IMO, unless we say Flink will not support
> > slot-level
> > > > GPU
> > > > > > > > > slicing at
> > > > > > > > > > > > > least
> > > > > > > > > > > > > > > in the foreseeable future, I don't see a good
> > way to
> > > > > > avoid
> > > > > > > > > touching
> > > > > > > > > > > > > the TM
> > > > > > > > > > > > > > > core. To that end, I think Becket's suggestion
> > > > points to
> > > > > > a
> > > > > > > good
> > > > > > > > > > > > > direction,
> > > > > > > > > > > > > > > that supports more features (GPU, FPGA, etc.)
> > with
> > > > less
> > > > > > > > > coupling to
> > > > > > > > > > > > > the TM
> > > > > > > > > > > > > > > core (only needs to understand the general
> > > > interfaces).
> > > > > > The
> > > > > > > > > > > detailed
> > > > > > > > > > > > > > > implementation for specific resource types can
> > even
> > > > be
> > > > > > > > > encapsulated
> > > > > > > > > > > > as
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > library.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for sharing your thought on the final
> > state.
> > > > > > > Despite the
> > > > > > > > > > > > > details how
> > > > > > > > > > > > > > > the interfaces should look like, I think this is
> > a
> > > > really
> > > > > > > good
> > > > > > > > > > > > > abstraction
> > > > > > > > > > > > > > > for supporting general resource types.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I'd like to further clarify that, the following
> > three
> > > > > > > things
> > > > > > > > > are
> > > > > > > > > > > all
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > the "Flink core" needs to understand.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - The *amount* of resource, for scheduling. Actually, we already have
> > > > > > > > > > > > > > > >   the Resource class in ResourceProfile and ResourceSpec for extended
> > > > > > > > > > > > > > > >   resource. It's just not really used.
> > > > > > > > > > > > > > > > - The *info*, that Flink provides to the operators / user codes.
> > > > > > > > > > > > > > > > - The *provider*, which generates the info based on the amount.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The "core" does not need to understand the
> > specific
> > > > > > > > > implementation
> > > > > > > > > > > > > details
> > > > > > > > > > > > > > > of the above three. They can even be implemented
> > in a
> > > > > > > 3rd-party
> > > > > > > > > > > > > library.
> > > > > > > > > > > > > > > Similar to how we allow users to define their
> > custom
> > > > > > > > > > > MetricReporter.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, Mar 23, 2020 at 8:45 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for the comment, Stephan.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > - If everything becomes a "core feature", it
> > will
> > > > > > make
> > > > > > > the
> > > > > > > > > > > > project
> > > > > > > > > > > > > hard
> > > > > > > > > > > > > > > > > to develop in the future. Thinking "library"
> > /
> > > > > > > "plugin" /
> > > > > > > > > > > > > "extension"
> > > > > > > > > > > > > > > > style
> > > > > > > > > > > > > > > > > where possible helps.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Completely agree. It is much more important to
> > > > design a
> > > > > > > > > mechanism
> > > > > > > > > > > > > than
> > > > > > > > > > > > > > > > focusing on a specific case. Here is what I am
> > > > thinking
> > > > > > > to
> > > > > > > > > fully
> > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > custom resource management:
> > > > > > > > > > > > > > > > > 1. On the JM / RM side, use ResourceProfile and ResourceSpec to define the
> > > > > > > > > > > > > > > > > resource and the amount required. They will be used to find suitable TM
> > > > > > > > > > > > > > > > > slots to run the tasks. At this point, the resources are only measured by
> > > > > > > > > > > > > > > > > amount, i.e. they do not have individual IDs.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2. On the TM side, have something like *"ResourceInfoProvider"* to identify
> > > > > > > > > > > > > > > > > and provide the detailed information of the individual resource, e.g. GPU
> > > > > > > > > > > > > > > > > ID. It is important because the operator may have to explicitly interact
> > > > > > > > > > > > > > > > > with the physical resource it uses. The ResourceInfoProvider might look
> > > > > > > > > > > > > > > > > like something below.
> > > > > > > > > > > > > > > > > interface ResourceInfoProvider<INFO> {
> > > > > > > > > > > > > > > > >     Map<AbstractID, INFO> retrieveResourceInfo(OperatorId opId, ResourceProfile resourceProfile);
> > > > > > > > > > > > > > > > > }
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - There could be several "*ResourceInfoProvider*" configured on the TM to
> > > > > > > > > > > > > > > > >   retrieve the information for different resources.
> > > > > > > > > > > > > > > > > - The TM will be responsible for assigning those individual resources to each
> > > > > > > > > > > > > > > > >   operator according to their requested amount.
> > > > > > > > > > > > > > > > > - The operators will be able to get the ResourceInfo from their
> > > > > > > > > > > > > > > > >   RuntimeContext.
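
A minimal sketch of how a GPU-specific provider could sit behind the interface quoted above, so that the core only deals with the generic type. Everything here is an assumption made for illustration: the `Sketch*` names are hypothetical, `String` stands in for `AbstractID` and `OperatorId`, and a plain integer amount stands in for `ResourceProfile`. It is not the FLIP's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for the quoted interface: String replaces
// AbstractID / OperatorId and an int amount replaces ResourceProfile.
interface SketchResourceInfoProvider<INFO> {
    Map<String, INFO> retrieveResourceInfo(String operatorId, int amount);
}

// Hypothetical info object that would be handed to user code.
final class SketchGpuInfo {
    final int index;

    SketchGpuInfo(int index) {
        this.index = index;
    }
}

// Hypothetical GPU-specific provider; the core would only see the interface.
final class SketchGpuInfoProvider implements SketchResourceInfoProvider<SketchGpuInfo> {
    private final int totalGpus;
    private int nextFree = 0;

    SketchGpuInfoProvider(int totalGpus) {
        this.totalGpus = totalGpus;
    }

    // Turns a requested amount into concrete GPU indexes for one operator.
    @Override
    public synchronized Map<String, SketchGpuInfo> retrieveResourceInfo(String operatorId, int amount) {
        if (amount < 0 || nextFree + amount > totalGpus) {
            throw new IllegalStateException("Not enough GPUs left for " + operatorId);
        }
        Map<String, SketchGpuInfo> result = new LinkedHashMap<>();
        for (int i = 0; i < amount; i++) {
            int index = nextFree++;
            result.put("gpu-" + index, new SketchGpuInfo(index));
        }
        return result;
    }
}
```

The scheduler side would still reason in amounts only; the mapping from amount to device index stays inside the provider.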
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If we agree this is a reasonable final state, we can adapt the current FLIP
> > > > > > > > > > > > > > > > > to it. In fact it does not sound like a big change to me. All the proposed
> > > > > > > > > > > > > > > > > configuration can be as is, it is just that Flink itself won't care about
> > > > > > > > > > > > > > > > > them, instead a GPUInfoProvider implementing the ResourceInfoProvider will
> > > > > > > > > > > > > > > > > use them.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Mar 23, 2020 at 1:47 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi all!
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The main point I wanted to throw into the
> > > > discussion
> > > > > > > is the
> > > > > > > > > > > > > following:
> > > > > > > > > > > > > > > > > - With more and more use cases, more and
> > more
> > > > tools
> > > > > > > go
> > > > > > > > > into
> > > > > > > > > > > > Flink
> > > > > > > > > > > > > > > > > - If everything becomes a "core feature",
> > it
> > > > will
> > > > > > > make
> > > > > > > > > the
> > > > > > > > > > > > > project
> > > > > > > > > > > > > > > hard
> > > > > > > > > > > > > > > > > to develop in the future. Thinking "library"
> > /
> > > > > > > "plugin" /
> > > > > > > > > > > > > "extension"
> > > > > > > > > > > > > > > > style
> > > > > > > > > > > > > > > > > where possible helps.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - A good thought experiment is always: How
> > many
> > > > > > > future
> > > > > > > > > > > > developers
> > > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > interact with this code (and possibly
> > understand
> > > > it
> > > > > > > > > partially),
> > > > > > > > > > > > > even if
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > features they touch have nothing to do with
> > GPU
> > > > > > > support. If
> > > > > > > > > > > many
> > > > > > > > > > > > > > > > > contributors to unrelated features will have
> > to
> > > > touch
> > > > > > > it
> > > > > > > > > and
> > > > > > > > > > > > > understand
> > > > > > > > > > > > > > > > it,
> > > > > > > > > > > > > > > > > then let's think if there is a different
> > > > solution.
> > > > > > > Maybe
> > > > > > > > > there
> > > > > > > > > > > is
> > > > > > > > > > > > > not,
> > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > then we should be sure why.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > - That led me to raising this issue: If
> > the GPU
> > > > > > > manager
> > > > > > > > > > > > becomes a
> > > > > > > > > > > > > > > core
> > > > > > > > > > > > > > > > > service in the TaskManager, Environment,
> > > > > > > RuntimeContext,
> > > > > > > > > etc.
> > > > > > > > > > > > then
> > > > > > > > > > > > > > > > everyone
> > > > > > > > > > > > > > > > > > developing TM and streaming tasks needs to
> > > > understand
> > > > > > > the
> > > > > > > > > GPU
> > > > > > > > > > > > > manager.
> > > > > > > > > > > > > > > > That
> > > > > > > > > > > > > > > > > seems oddly specific, is my impression.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Access to configuration seems not the right
> > > > reason to
> > > > > > > do
> > > > > > > > > that.
> > > > > > > > > > > We
> > > > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > > expose the Flink configuration from the
> > > > > > RuntimeContext
> > > > > > > > > anyways.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If GPUs are sliced and assigned during
> > > > scheduling,
> > > > > > > there
> > > > > > > > > may be
> > > > > > > > > > > > > reason,
> > > > > > > > > > > > > > > > > although it looks that it would belong to the
> > > > slot
> > > > > > > then. Is
> > > > > > > > > > > that
> > > > > > > > > > > > > what
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > are doing here?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Mar 20, 2020 at 2:58 AM Xintong Song
> > <
> > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for the feedback, Becket.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > IMO, eventually an operator should only see
> > > > info of
> > > > > > > GPUs
> > > > > > > > > that
> > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > dedicated
> > > > > > > > > > > > > > > > > > for it, instead of all GPUs on the
> > > > > > machine/container
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > > > current
> > > > > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > It does not make sense to let the user who
> > > > writes a
> > > > > > > UDF
> > > > > > > > > to
> > > > > > > > > > > > worry
> > > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > > coordination among multiple operators
> > running
> > > > on
> > > > > > the
> > > > > > > same
> > > > > > > > > > > > > machine.
> > > > > > > > > > > > > > > And
> > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > we want to limit the GPU info an operator
> > > > sees, we
> > > > > > > > > should not
> > > > > > > > > > > > > let the
> > > > > > > > > > > > > > > > > > operator to instantiate GPUManager, which
> > > > means we
> > > > > > > have
> > > > > > > > > to
> > > > > > > > > > > > expose
> > > > > > > > > > > > > > > > > something
> > > > > > > > > > > > > > > > > > through runtime context, either GPU info or
> > > > some
> > > > > > > kind of
> > > > > > > > > > > > limited
> > > > > > > > > > > > > > > access
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > the GPUManager.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, Mar 19, 2020 at 5:48 PM Becket Qin
> > <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > It probably makes sense for us to first
> > agree
> > > > on
> > > > > > the
> > > > > > > > > final
> > > > > > > > > > > > > state.
> > > > > > > > > > > > > > > More
> > > > > > > > > > > > > > > > > > > specifically, will the resource info be
> > > > exposed
> > > > > > > through
> > > > > > > > > > > > runtime
> > > > > > > > > > > > > > > > context
> > > > > > > > > > > > > > > > > > > eventually?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > If that is the final state and we have a
> > > > seamless
> > > > > > > > > migration
> > > > > > > > > > > > > story
> > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > FLIP to that final state, personally I
> > think
> > > > it
> > > > > > is
> > > > > > > OK
> > > > > > > > > to
> > > > > > > > > > > > > expose the
> > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > info in the runtime context.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Mon, Mar 16, 2020 at 11:21 AM Xintong
> > > > Song <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > @Yangze,
> > > > > > > > > > > > > > > > > > > > I think what Stephan means (@Stephan,
> > > > please
> > > > > > > correct
> > > > > > > > > me
> > > > > > > > > > > if
> > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > wrong)
> > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > that, we might not need to hold and
> > > > maintain
> > > > > > the
> > > > > > > > > > > GPUManager
> > > > > > > > > > > > > as a
> > > > > > > > > > > > > > > > > > service
> > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > TaskManagerServices or RuntimeContext.
> > An
> > > > > > > > > alternative is
> > > > > > > > > > > to
> > > > > > > > > > > > > > > create
> > > > > > > > > > > > > > > > /
> > > > > > > > > > > > > > > > > > > > retrieve the GPUManager only in the
> > > > operators
> > > > > > > that
> > > > > > > > > need
> > > > > > > > > > > it,
> > > > > > > > > > > > > e.g.,
> > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > static method `GPUManager.get()`.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > @Stephan,
> > > > > > > > > > > > > > > > > > > > I agree with you on excluding
> > GPUManager
> > > > from
> > > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > - For the first step, where we
> > provide
> > > > > > unified
> > > > > > > > > > > TM-level
> > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > information
> > > > > > > > > > > > > > > > > > > > to all operators, it should be fine
> > to
> > > > have
> > > > > > > > > operators
> > > > > > > > > > > > > access /
> > > > > > > > > > > > > > > > > > > > lazy-initiate GPUManager by
> > themselves.
> > > > > > > > > > > > > > > > > > > > - In future, we might have some more
> > > > > > > fine-grained
> > > > > > > > > GPU
> > > > > > > > > > > > > > > > management,
> > > > > > > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > > > > > we need to maintain GPUManager as a
> > > > service
> > > > > > > and
> > > > > > > > > put
> > > > > > > > > > > GPU
> > > > > > > > > > > > > info
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > slot
> > > > > > > > > > > > > > > > > > > > profiles. But at least for now it's
> > not
> > > > > > > necessary
> > > > > > > > > to
> > > > > > > > > > > > > introduce
> > > > > > > > > > > > > > > > > such
> > > > > > > > > > > > > > > > > > > > complexity.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > However, I have some concerns on
> > excluding
> > > > > > > GPUManager
> > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > RuntimeContext
> > > > > > > > > > > > > > > > > > > > and let operators access it directly.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > - Configurations needed for creating the GPUManager are not always
> > > > > > > > > > > > > > > > > > > > >   available for operators.
> > > > > > > > > > > > > > > > > > > > - If later we want to have
> > fine-grained
> > > > > > > control
> > > > > > > > > over
> > > > > > > > > > > GPU
> > > > > > > > > > > > > > > (e.g.,
> > > > > > > > > > > > > > > > > > > > operators in each slot can only see
> > GPUs
> > > > > > > reserved
> > > > > > > > > for
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > slot),
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > approach cannot be easily extended.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I would suggest wrapping the GPUManager behind RuntimeContext and only
> > > > > > > > > > > > > > > > > > > > > exposing the GPUInfo to users. For now, we can declare a method
> > > > > > > > > > > > > > > > > > > > > `getGPUInfo()` in RuntimeContext, with a default definition that calls
> > > > > > > > > > > > > > > > > > > > > `GPUManager.get()` to get the lazily-created GPUManager. If later we want
> > > > > > > > > > > > > > > > > > > > > to create / retrieve GPUManager in a different way, we can simply change
> > > > > > > > > > > > > > > > > > > > > how `getGPUInfo` is implemented, without needing to change any public
> > > > > > > > > > > > > > > > > > > > > interfaces.
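
A minimal sketch of the suggestion above, assuming a lazily created singleton behind a default interface method. The names `SketchGpuManager` and `SketchRuntimeContext` are hypothetical stand-ins for `GPUManager` and `RuntimeContext`, the GPU info is reduced to a list of indexes, and discovery is faked with fixed values; the actual classes proposed in the FLIP may look different.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical lazily created manager; stands in for the GPUManager discussed above.
final class SketchGpuManager {
    private static SketchGpuManager instance;
    private final List<Integer> gpuIndexes;

    private SketchGpuManager(List<Integer> gpuIndexes) {
        this.gpuIndexes = gpuIndexes;
    }

    // Creates the manager on first use; discovery is faked with fixed indexes.
    static synchronized SketchGpuManager get() {
        if (instance == null) {
            instance = new SketchGpuManager(Collections.unmodifiableList(Arrays.asList(0, 1)));
        }
        return instance;
    }

    List<Integer> getGpuIndexes() {
        return gpuIndexes;
    }
}

// Stand-in for RuntimeContext: users only see the info, never the manager.
interface SketchRuntimeContext {
    default List<Integer> getGPUInfo() {
        return SketchGpuManager.get().getGpuIndexes();
    }
}
```

User code would call only `getGPUInfo()`, so the way the manager is created or retrieved could later change inside the default method without touching the public signature.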
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 10:09 AM Yangze
> > > > Guo <
> > > > > > > > > > > > > [hidden email]>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > @Stephan
> > > > > > > > > > > > > > > > > > > > > Do you mean Minicluster? Yes, it
> > makes
> > > > sense
> > > > > > to
> > > > > > > > > share
> > > > > > > > > > > the
> > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > Manager
> > > > > > > > > > > > > > > > > > > > > in such scenario.
> > > > > > > > > > > > > > > > > > > > > If that's what you worry about, I'm
> > +1
> > > > for
> > > > > > > holding
> > > > > > > > > > > > > > > > > > > > > GPUManager(ExternalResourceManagers)
> > in
> > > > > > > > > TaskExecutor
> > > > > > > > > > > > > instead of
> > > > > > > > > > > > > > > > > > > > > TaskManagerServices.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Regarding the
> > > > RuntimeContext/FunctionContext,
> > > > > > > it
> > > > > > > > > just
> > > > > > > > > > > > > holds the
> > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > > info instead of the GPU Manager.
> > AFAIK,
> > > > it's
> > > > > > > the
> > > > > > > > > only
> > > > > > > > > > > > > place we
> > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > > > pass GPU info to the
> > > > > > > > > RichFunction/UserDefinedFunction.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Sat, Mar 14, 2020 at 4:06 AM Isaac
> > > > > > Godfried
> > > > > > > <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > ---- On Fri, 13 Mar 2020 15:58:20
> > +0000
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > wrote
> > > > > > > > > > > > > > > > > > ----
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Can we somehow keep this out
> > of the
> > > > > > > > > TaskManager
> > > > > > > > > > > > > services
> > > > > > > > > > > > > > > > > > > > > > > I fear that we could not. IMO,
> > the
> > > > > > > > > GPUManager(or
> > > > > > > > > > > > > > > > > > > > > > > ExternalServicesManagers in
> > future)
> > > > is
> > > > > > > > > conceptually
> > > > > > > > > > > > > one of
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > task
> > > > > > > > > > > > > > > > > > > > > > > manager services, just like
> > > > MemoryManager
> > > > > > > > > before
> > > > > > > > > > > > 1.10.
> > > > > > > > > > > > > > > > > > > > > > > - It maintains/holds the GPU
> > > > resource at
> > > > > > TM
> > > > > > > > > level
> > > > > > > > > > > and
> > > > > > > > > > > > > all
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > > operators allocate the GPU
> > resources
> > > > from
> > > > > > > it.
> > > > > > > > > So,
> > > > > > > > > > > it
> > > > > > > > > > > > > should
> > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > > > exclusive to a single
> > TaskExecutor.
> > > > > > > > > > > > > > > > > > > > > > > - We could add a collection
> > called
> > > > > > > > > > > > > ExternalResourceManagers
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > hold
> > > > > > > > > > > > > > > > > > > > > > > all managers of other external
> > > > resources
> > > > > > > in the
> > > > > > > > > > > > future.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Can you help me understand why this
> > > > needs
> > > > > > the
> > > > > > > > > > > addition
> > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > > TaskManagerServices
> > > > > > > > > > > > > > > > > > > > > > or in the RuntimeContext?
> > > > > > > > > > > > > > > > > > > > > > Are you worried about the case when
> > > > > > multiple
> > > > > > > Task
> > > > > > > > > > > > > Executors
> > > > > > > > > > > > > > > run
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > same
> > > > > > > > > > > > > > > > > > > > > > JVM? That's not common, but
> > wouldn't it
> > > > > > > actually
> > > > > > > > > be
> > > > > > > > > > > > good
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > case
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > share the GPU Manager, given that
> > the
> > > > GPU
> > > > > > is
> > > > > > > > > shared?
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > ---------------------------
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > What parts need information about
> > > > this?
> > > > > > > > > > > > > > > > > > > > > > > In this FLIP, operators need the
> > > > > > > information.
> > > > > > > > > Thus,
> > > > > > > > > > > > we
> > > > > > > > > > > > > > > expose
> > > > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > > > > information to the
> > > > > > > > > RuntimeContext/FunctionContext.
> > > > > > > > > > > > The
> > > > > > > > > > > > > slot
> > > > > > > > > > > > > > > > > > profile
> > > > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > > > > not aware of GPU resources as
> > GPU is
> > > > TM
> > > > > > > level
> > > > > > > > > > > > resource
> > > > > > > > > > > > > now.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Can the GPU Manager be a "self
> > > > > > contained"
> > > > > > > > > thing
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > simply
> > > > > > > > > > > > > > > > > > takes
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > > configuration, and then abstracts
> > > > > > > everything
> > > > > > > > > > > > > internally?
> > > > > > > > > > > > > > > > > > > > > > > Yes, we just pass the path/args
> > of
> > > > the
> > > > > > > discover
> > > > > > > > > > > > script
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > > many
> > > > > > > > > > > > > > > > > > > > > > > GPUs per TM to it. It takes the
> > > > > > > responsibility
> > > > > > > > > to
> > > > > > > > > > > get
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > GPU
> > > > > > > > > > > > > > > > > > > > > > > information and expose them to
> > the
> > > > > > > > > > > > > > > > > RuntimeContext/FunctionContext
> > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > > > Operators. Meanwhile, we'd
> > better not
> > > > > > allow
> > > > > > > > > > > operators
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > directly
> > > > > > > > > > > > > > > > > > > > > > > access GPUManager, they should get what
> > what
> > > > > > they
> > > > > > > want
> > > > > > > > > > > from
> > > > > > > > > > > > > > > Context.
> > > > > > > > > > > > > > > > > We
> > > > > > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > > > > > then decouple the
> > > > > > interface/implementation
> > > > > > > of
> > > > > > > > > > > > > GPUManager
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > Public
> > > > > > > > > > > > > > > > > > > > > > > API.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > On Fri, Mar 13, 2020 at 7:26 PM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > It sounds fine to initially start with GPU specific support and think
> > > > > > > > > > > > > > > > > > > > > > > > > about generalizing this once we better understand the space.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > About the implementation suggested in FLIP-108:
> > > > > > > > > > > > > > > > > > > > > > > > > - Can we somehow keep this out of the TaskManager services? Anything
> > > > > > > > > > > > > > > > > > > > > > > > > we have to pull through all layers of the TM makes the TM components
> > > > > > > > > > > > > > > > > > > > > > > > > yet more complex and harder to maintain.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > - What parts need information about this?
> > > > > > > > > > > > > > > > > > > > > > > > > -> do the slot profiles need information about the GPU?
> > > > > > > > > > > > > > > > > > > > > > > > > -> Can the GPU Manager be a "self contained" thing that simply takes
> > > > > > > > > > > > > > > > > > > > > > > > > the configuration, and then abstracts everything internally? Operators
> > > > > > > > > > > > > > > > > > > > > > > > > can access it via "GPUManager.get()" or so?
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 4:19 AM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for all the feedback.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > @Becket
> > > > > > > > > > > > > > > > > > > > > > > > > > Regarding the WebUI and GPUInfo, you're right, I'll add them to the
> > > > > > > > > > > > > > > > > > > > > > > > > > Public API section.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > @Stephan @Becket
> > > > > > > > > > > > > > > > > > > > > > > > > > Regarding the general extended resource mechanism, I second Xintong's
> > > > > > > > > > > > > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > > > > > > > > > > > > - It's better to leverage ResourceProfile and ResourceSpec once we
> > > > > > > > > > > > > > > > > > > > > > > > > > support fine-grained GPU scheduling. As a first step proposal, I
> > > > > > > > > > > > > > > > > > > > > > > > > > prefer not to include it in the scope of this FLIP.
> > > > > > > > > > > > > > > > > > > > > > > > > > - Regarding the "Extended Resource Manager", if I understand
> > > > > > > > > > > > > > > > > > > > > > > > > > correctly, it is just a code refactoring atm; we could extract the
> > > > > > > > > > > > > > > > > > > > > > > > > > open/close/allocateExtendResources of GPUManager to that interface.
> > > > > > > > > > > > > > > > > > > > > > > > > > If that is the case, +1 to do it during implementation.
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > @Xingbo
> > > > > > > > > > > > > > > > > > > > > > > > > > As Xintong said, we looked into how Spark supports a general "Custom
> > > > > > > > > > > > > > > > > > > > > > > > > > Resource Scheduling" before and decided to introduce a common resource
> > > > > > > > > > > > > > > > > > > > > > > > > > configuration schema
> > > > > > > > > > > > > > > > > > > > > > > > > > (taskmanager.resource.{resourceName}.amount/discovery-script) to make
> > > > > > > > > > > > > > > > > > > > > > > > > > it more extensible. I think "resource" is the proper level to contain
> > > > > > > > > > > > > > > > > > > > > > > > > > all the configs of extended resources.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:48 AM Xingbo Huang <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks a lot for the FLIP, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > There is no doubt that GPU resource management support will greatly
> > > > > > > > > > > > > > > > > > > > > > > > > > > facilitate the development of AI-related applications by PyFlink users.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > I have only one comment about this wiki:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Regarding the names of several GPU configurations, I think it is better
> > > > > > > > > > > > > > > > > > > > > > > > > > > to delete the resource field, which makes them consistent with the
> > > > > > > > > > > > > > > > > > > > > > > > > > > names of other resource-related configurations in TaskManagerOption.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > e.g. taskmanager.resource.gpu.discovery-script.path ->
> > > > > > > > > > > > > > > > > > > > > > > > > > > taskmanager.gpu.discovery-script.path
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > Xingbo
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 10:39 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > @Stephan, @Becket,
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Actually, Yangze, Yang and I also had an offline discussion about
> > > > > > > > > > > > > > > > > > > > > > > > > > > > making the "GPU Support" a more general "Extended Resource Support".
> > > > > > > > > > > > > > > > > > > > > > > > > > > > We believe supporting extended resources through a general mechanism
> > > > > > > > > > > > > > > > > > > > > > > > > > > > is definitely a good and extensible way. The reason we propose this
> > > > > > > > > > > > > > > > > > > > > > > > > > > > FLIP with its scope narrowed down to GPU alone is mainly the concern
> > > > > > > > > > > > > > > > > > > > > > > > > > > > about the extra effort and review capacity needed for a general
> > > > > > > > > > > > > > > > > > > > > > > > > > > > mechanism.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > To come up with a good design for a general extended resource
> > > > > > > > > > > > > > > > > > > > > > > > > > > > management mechanism, we would need to investigate more on how people
> > > > > > > > > > > > > > > > > > > > > > > > > > > > use different kinds of resources in practice. For GPU, we learnt such
> > > > > > > > > > > > > > > > > > > > > > > > > > > > knowledge from the experts, Becket and his team members. But for FPGA,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > or other potential extended resources, we don't have such convenient
> > > > > > > > > > > > > > > > > > > > > > > > > > > > information sources, so the investigation would require more effort,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > which I tend to think is not necessary atm.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > On the other hand, we also looked into how Spark supports a general
> > > > > > > > > > > > > > > > > > > > > > > > > > > > "Custom Resource Scheduling". Assuming we want to have a similar
> > > > > > > > > > > > > > > > > > > > > > > > > > > > general extended resource mechanism in the future, we believe that the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > current GPU support design can be easily extended, in an incremental
> > > > > > > > > > > > > > > > > > > > > > > > > > > > way and without too much rework.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > - The most important part is probably the user interfaces. Spark
> > > > > > > > > > > > > > > > > > > > > > > > > > > > offers configuration options to define the amount, discovery script
> > > > > > > > > > > > > > > > > > > > > > > > > > > > and vendor (on k8s) on a per resource type basis [1], which is very
> > > > > > > > > > > > > > > > > > > > > > > > > > > > similar to what we proposed in this FLIP. I think it's not necessary
> > > > > > > > > > > > > > > > > > > > > > > > > > > > to expose config options in the general way atm, since we do not have
> > > > > > > > > > > > > > > > > > > > > > > > > > > > support for other resource types now. If we later decide to have per
> > > > > > > > > > > > > > > > > > > > > > > > > > > > resource type config options, we can keep backwards compatibility
> > > > > > > > > > > > > > > > > > > > > > > > > > > > with the currently proposed options through simple key mapping.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > - For the GPU Manager, if later needed we can change it to an
> > > > > > > > > > > > > > > > > > > > > > > > > > > > "Extended Resource Manager" (or whatever it is called). That should be
> > > > > > > > > > > > > > > > > > > > > > > > > > > > a pure component-internal refactoring.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > - For ResourceProfile and ResourceSpec, there are already fields for
> > > > > > > > > > > > > > > > > > > > > > > > > > > > general extended resources. We can of course leverage them when
> > > > > > > > > > > > > > > > > > > > > > > > > > > > supporting fine grained GPU scheduling. That is also not in the scope
> > > > > > > > > > > > > > > > > > > > > > > > > > > > of this first step proposal, and would require FLIP-56 to be finished
> > > > > > > > > > > > > > > > > > > > > > > > > > > > first.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > To sum up, I agree with Becket that we should have a separate FLIP for
> > > > > > > > > > > > > > > > > > > > > > > > > > > > the general extended resource mechanism, and keep it in mind when
> > > > > > > > > > > > > > > > > > > > > > > > > > > > discussing and implementing the current one.
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] https://spark.apache.org/docs/3.0.0-preview/configuration.html#custom-resource-scheduling-and-configuration-overview
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 9:18 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > That's a good point, Stephan. It makes total sense to generalize the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > resource management to support custom resources. Having that allows
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > users to add new resources by themselves. The general resource
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > management may involve two different aspects:
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The custom resource type definition. It is supported by the extended
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > resources in ResourceProfile and ResourceSpec. This will likely cover
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > the majority of the cases.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. The custom resource allocation logic, i.e. how to assign the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > resources to different tasks, operators, and so on. This may require
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > two levels / steps:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > a. Subtask level - make sure the subtasks are put into suitable slots.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > It is done by the global RM and is not customizable right now.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > b. Operator level - map the exact resource to the operators in the TM,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > e.g. GPU 1 for operator A, GPU 2 for operator B. This step is needed
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > assuming the global RM does not distinguish individual resources of the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > same type. It is true for memory, but not for GPU.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > The GPU manager is designed to do 2.b here. So it should discover the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > physical GPU information and bind/match it to each operator. Making
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > this general will fill in the missing piece to support custom resource
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > type definition. But I'd avoid calling it an "External Resource
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Manager" to avoid confusion with RM; maybe something like "Operator
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Resource Assigner" would be more accurate. So for each resource type
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > users can have an optional "Operator Resource Assigner" in the TM. For
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > memory, users don't need this, but for other extended resources, users
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > may need that.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Personally I think a pluggable "Operator Resource Assigner" is
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > achievable in this FLIP. But I am also OK with having that in a
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > separate FLIP because the interface between the "Operator Resource
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Assigner" and operator may take a while to settle down if we want to
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > make it generic. But I think our implementation should take this future
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > work into consideration so that we don't need to break backwards
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > compatibility once we have that.
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Mar 4, 2020 at 12:27 AM Stephan Ewen <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you for writing this FLIP.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I cannot really give much input into the mechanics of GPU-aware
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > scheduling and GPU allocation, as I have no experience with that.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > One thought I had when reading the proposal is if it makes sense to
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > look at the "GPU Manager" as an "External Resource Manager", and GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > is one such resource. The way I understand the ResourceProfile and
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ResourceSpec, that is how it is done there. It has the advantage that
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > it looks more extensible. Maybe there is a GPU Resource, a specialized
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > NVIDIA GPU Resource, an FPGA Resource, an Alibaba TPU Resource, etc.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > Stephan
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 7:57 AM Becket Qin <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the FLIP, Yangze. GPU resource management support is a
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > must-have for machine learning use cases. Actually, it is one of the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > most frequently asked questions from users who are interested in
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > using Flink for ML.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Some quick comments / questions to the wiki.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. The WebUI / REST API should probably also be mentioned in the
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >    public interface section.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 2. Is the data structure that holds GPU info also a public API?
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Mar 3, 2020 at 10:15 AM Xintong Song <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for drafting the FLIP and kicking off the discussion, Yangze.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Big +1 for this feature. Supporting the use of GPUs in Flink is
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > significant, especially for the ML scenarios.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I've reviewed the FLIP wiki doc and it looks good to me. I think it's
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > a very good first step for Flink's GPU support.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 2, 2020 at 12:06 PM Yangze Guo <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > We would like to start a discussion thread on "FLIP-108: Add GPU
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > support in Flink"[1].
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > This FLIP mainly discusses the following issues:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Enable users to configure the number of GPUs in a task executor and
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >   forward such requirements to the external resource managers (for
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >   Kubernetes/Yarn/Mesos setups).
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Provide information about available GPU resources to operators.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Key changes proposed in the FLIP are as follows:
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Forward GPU resource requirements to Yarn/Kubernetes.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce GPUManager as one of the task manager services to discover
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >   and expose GPU resource information to the context of functions.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - Introduce the default script for GPU discovery, in which we provide
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >   the privilege mode to help users achieve worker-level isolation in
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >   standalone mode.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Please find more details in the FLIP wiki document [1]. Looking
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > forward to your feedback.
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Yangze Guo